BCM theory, also called BCM synaptic modification or the BCM rule, named for Elie Bienenstock, Leon Cooper, and Paul Munro, is a physical theory of learning in the visual cortex developed in 1981. The BCM model proposes a sliding threshold for long-term potentiation (LTP) or long-term depression (LTD) induction, and states that synaptic plasticity is stabilized by a dynamic adaptation of the time-averaged postsynaptic activity. According to the BCM model, when a pre-synaptic neuron fires, the post-synaptic neuron will tend to undergo LTP if it is in a high-activity state (e.g., firing at high frequency and/or having a high internal calcium concentration), or LTD if it is in a lower-activity state (e.g., firing at low frequency or having a low internal calcium concentration).[1] This theory is often used to explain how cortical neurons can undergo either LTP or LTD depending on the conditioning stimulus protocol applied to pre-synaptic neurons (usually high-frequency stimulation, or HFS, for LTP, and low-frequency stimulation, or LFS, for LTD).[2]
In 1949, Donald Hebb proposed a working mechanism for memory and computational adaptation in the brain, now called Hebbian learning and summarized by the maxim that cells that fire together, wire together.[3] This notion is foundational in the modern understanding of the brain as a neural network and, though not universally true, remains a good first approximation supported by decades of evidence.[4]
However, Hebb's rule has problems: it has no mechanism for connections to get weaker and no upper bound on how strong they can get. In other words, the model is unstable, both theoretically and computationally. Later modifications gradually improved Hebb's rule, normalizing it and allowing for decay of synapses, where no activity or unsynchronized activity between neurons results in a loss of connection strength. New biological evidence brought this activity to a peak in the 1970s, when theorists formalized various approximations in the theory, such as the use of firing frequency instead of potential in determining neuron excitation, and the assumption of ideal and, more importantly, linear synaptic integration of signals. That is, input currents are assumed to add without unexpected behavior when determining whether or not a cell will fire.
These approximations resulted in the basic form of BCM below in 1979, but the final step came in the form of mathematical analysis to prove stability and computational analysis to prove applicability, culminating in Bienenstock, Cooper, and Munro's 1982 paper.
Since then, experiments have shown evidence for BCM behavior in both the visual cortex and the hippocampus, the latter of which plays an important role in the formation and storage of memories. Both of these areas are well-studied experimentally, but both theory and experiment have yet to establish conclusive synaptic behavior in other areas of the brain. It has been proposed that in the cerebellum, the parallel-fiber to Purkinje cell synapse follows an "inverse BCM rule", meaning that at the time of parallel fiber activation, a high calcium concentration in the Purkinje cell results in LTD, while a lower concentration results in LTP. Furthermore, the biological implementation for synaptic plasticity in BCM has yet to be established.[5]
The basic BCM rule takes the form
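$$\frac{d m_j(t)}{dt} = \phi(c(t))\, d_j(t) - \epsilon\, m_j(t),$$
where:
$m_j$ is the synaptic weight of the $j$th synapse,
$d_j$ is the input current of the $j$th synapse,
$c(t) = \mathbf{m}(t)\cdot\mathbf{d}(t) = \sum_j m_j(t)\, d_j(t)$ is the inner product of weights and input currents (the weighted sum of inputs),
$\phi(c)$ is a non-linear function that must change sign at some threshold $\theta_M$; that is, $\phi(c) < 0$ if and only if $c < \theta_M$,
$\epsilon$ is the constant of uniform decay of all synapses.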
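This model is a modified form of the Hebbian learning rule, $\dot{m}_j = c\, d_j$, and requires a suitable choice of the function $\phi$ to avoid the Hebbian problems of instability.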
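The basic rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quadratic form `phi(c) = c * (c - theta)`, the fixed threshold `theta`, the step size `dt`, and the function names are all choices made here for the example, not part of the rule as stated above.

```python
def phi(c, theta):
    """Assumed example nonlinearity: negative for 0 < c < theta, positive for c > theta."""
    return c * (c - theta)


def bcm_derivative(m, d, theta, eps=0.0):
    """Right-hand side of dm_j/dt = phi(c) * d_j - eps * m_j for every synapse j."""
    c = sum(mj * dj for mj, dj in zip(m, d))  # weighted sum of inputs
    f = phi(c, theta)
    return [f * dj - eps * mj for mj, dj in zip(m, d)]


def euler_step(m, d, theta, eps=0.0, dt=0.1):
    """One explicit Euler step; dt is an illustrative step size, not a prescribed value."""
    dm = bcm_derivative(m, d, theta, eps)
    return [mj + dt * dmj for mj, dmj in zip(m, dm)]


m = [0.1, 0.05]      # illustrative weights
d = [0.9, 0.1]       # illustrative input pattern
theta = 0.095 ** 3   # illustrative fixed threshold
print([round(x, 3) for x in bcm_derivative(m, d, theta)])  # [0.008, 0.001]
```

With `eps = 0` the weight change is parallel to the input `d`, scaled by `phi(c)`; the decay term only matters when `eps` is non-zero.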
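Bienenstock et al. rewrite $\phi(c)$ as a function $\phi(c,\bar{c})$, where $\bar{c}$ is the time average of $c$. With this modification, and discarding the uniform decay, the rule takes the vector form
$$\dot{\mathbf{m}}(t) = \phi(c(t),\bar{c}(t))\,\mathbf{d}(t).$$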
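The conditions for stable learning are derived rigorously in BCM, noting that with $c(t) = \mathbf{m}(t)\cdot\mathbf{d}(t)$ and with the approximation of the average output $\bar{c}(t) \approx \mathbf{m}(t)\cdot\bar{\mathbf{d}}$, it is sufficient that
$$\operatorname{sgn}\phi(c,\bar{c}) = \operatorname{sgn}\left(c - \left(\frac{\bar{c}}{c_0}\right)^{p}\bar{c}\right) \quad \text{for } c > 0, \text{ and}$$
$$\phi(0,\bar{c}) = 0 \quad \text{for all } \bar{c},$$
or equivalently, that the threshold $\theta_M(\bar{c}) = (\bar{c}/c_0)^{p}\,\bar{c}$, where $p$ and $c_0$ are fixed positive constants.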
When implemented, the theory is often taken such that
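$$\phi(c,\bar{c}) = c\,(c - \theta_M) \quad \text{and} \quad \theta_M = \overline{c^2} = \frac{1}{\tau}\int_{-\infty}^{t} c^2(t')\, e^{-(t-t')/\tau}\, dt',$$
where $\tau$ is a time constant of selectivity, setting the time scale over which the squared output is averaged.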
The model has drawbacks, as it requires both long-term potentiation and long-term depression (increases and decreases in synaptic strength), something which has not been observed in all cortical systems. Further, it requires a variable activation threshold and depends strongly on the stability of the selected fixed points $c_0$ and $p$.
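This example is a particular case of the one in the "Mathematical results" chapter of the work by Bienenstock et al., assuming $p = 2$ and $c_0 = 1$. With these values, $\theta_M = (\bar{c}/c_0)^{p}\,\bar{c} = \bar{c}^3$, and we choose $\phi(c,\bar{c}) = c\,(c - \theta_M)$, which fulfills the stability conditions given above.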
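Assume two presynaptic neurons that provide inputs $d_1$ and $d_2$, whose activity follows a repetitive cycle: for half of the time $\mathbf{d} = (d_1, d_2) = (0.9, 0.1)$, and for the remaining time $\mathbf{d} = (0.2, 0.7)$. The time average $\bar{c}$ is taken as the average of the values of $c$ in the first and second halves of a cycle.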
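Let the initial value of the weights be $\mathbf{m} = (0.1, 0.05)$. In the first half of the cycle, with $\mathbf{d} = (0.9, 0.1)$ and $\mathbf{m} = (0.1, 0.05)$, the weighted sum is $c = 0.9 \cdot 0.1 + 0.1 \cdot 0.05 = 0.095$, and the same value is used as the initial average $\bar{c}$. That gives $\theta_M = 0.001$, $\phi = 0.009$, and $\dot{\mathbf{m}} = (0.008, 0.001)$. Adding a fraction of the derivative to the weights gives the new weights $\mathbf{m} = (0.101, 0.051)$.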
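In the next half of the cycle, the inputs are $\mathbf{d} = (0.2, 0.7)$ and the weights are $\mathbf{m} = (0.101, 0.051)$. That gives $c = 0.055$, a full-cycle average $\bar{c} = (0.095 + 0.055)/2 = 0.075$, $\theta_M = 0.000$, $\phi = 0.003$, and $\dot{\mathbf{m}} = (0.001, 0.002)$. Updating the weights in the same way gives the new weights $\mathbf{m} = (0.110, 0.055)$.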
Repeating the previous cycle we obtain, after several hundred iterations, that stability is reached with $\mathbf{m} = (3.246, -0.927)$, $c = \sqrt{8} = 2.828$ (first half) and $c = 0.000$ (remaining time), $\bar{c} = \sqrt{8}/2 = 1.414$, $\theta_M = \sqrt{8} = 2.828$, $\phi = 0.000$, and $\dot{\mathbf{m}} = (0.000, 0.000)$.
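The stable values quoted above can be checked by hand. The following worked derivation is a sketch that assumes the cycle settles into the state where the first input pattern gives $c_1 = \theta_M$ and the second gives $c_2 = 0$; the subscripts 1 and 2 are labels introduced here for the two halves of the cycle.

```latex
\begin{aligned}
\bar{c} &= \tfrac{1}{2}(c_1 + c_2) = \tfrac{1}{2}\,c_1, \qquad
\theta_M = \bar{c}^{3} = \tfrac{1}{8}\,c_1^{3}, \\
c_1 &= \theta_M \;\Rightarrow\; c_1^{2} = 8 \;\Rightarrow\; c_1 = \sqrt{8} \approx 2.828, \\
0.2\,m_1 + 0.7\,m_2 &= 0 \;\Rightarrow\; m_1 = -3.5\,m_2, \\
0.9\,m_1 + 0.1\,m_2 &= \sqrt{8} \;\Rightarrow\; -3.05\,m_2 = \sqrt{8}
\;\Rightarrow\; m_2 \approx -0.927,\quad m_1 \approx 3.246.
\end{aligned}
```

Both outputs are then zeros of $\phi(c,\bar{c}) = c\,(c - \theta_M)$, so the derivative of the weights vanishes in both halves of the cycle.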
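Note how, as predicted, the final weight vector $\mathbf{m}$ has become orthogonal to one of the input patterns, and the final values of $c$ in both intervals are zeros of the function $\phi$.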
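The example can also be explored numerically. The sketch below is one possible discretization, not the exact procedure used in the text: it assumes a step size `eta = 0.1`, recomputes $\bar{c}$ as the mean output over both patterns with the current weights, and applies the contributions of both patterns together once per cycle. The names `PATTERNS`, `outputs`, and `run` are hypothetical.

```python
import math

PATTERNS = [(0.9, 0.1), (0.2, 0.7)]  # the two input patterns of the example


def outputs(m):
    """Weighted sum c = m . d for each input pattern."""
    return [m[0] * d[0] + m[1] * d[1] for d in PATTERNS]


def run(m=(0.1, 0.05), eta=0.1, cycles=2000):
    """Iterate the BCM rule with theta = c_bar**3 (p = 2, c0 = 1).

    eta and cycles are illustrative assumptions.
    """
    m = list(m)
    for _ in range(cycles):
        c = outputs(m)
        c_bar = sum(c) / len(c)   # average output over one cycle
        theta = c_bar ** 3        # sliding threshold
        dm = [0.0, 0.0]
        for ck, d in zip(c, PATTERNS):
            f = ck * (ck - theta)  # phi(c, c_bar)
            dm[0] += f * d[0]
            dm[1] += f * d[1]
        m[0] += eta * dm[0]
        m[1] += eta * dm[1]
    return m


m = run()
c = outputs(m)
print([round(x, 3) for x in m])        # final weights
print(abs(c[0] - math.sqrt(8)) < 1e-3)  # response to first pattern near sqrt(8)
print(abs(c[1]) < 1e-3)                 # weights orthogonal to second pattern
```

Under these assumptions the weights should settle near the values quoted in the text, with the response to the second pattern going to zero, which is the selectivity the example is meant to illustrate.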
The first major experimental confirmation of BCM came in 1992 in investigating LTP and LTD in the hippocampus. Serena Dudek's experimental work showed qualitative agreement with the final form of the BCM activation function.[8] This experiment was later replicated in the visual cortex, which BCM was originally designed to model.[9] This work provided further evidence of the necessity for a variable threshold function for stability in Hebbian-type learning (BCM or others).
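Experimental evidence had been non-specific to BCM until Rittenhouse et al. confirmed BCM's prediction of synapse modification in the visual cortex when one eye is selectively closed. Specifically,
$$\log\left(\frac{m_{\rm closed}(t)}{m_{\rm closed}(0)}\right) \sim -\overline{n^2}\,t,$$
where $\overline{n^2}$ describes the variance in spontaneous activity or noise in the closed eye and $t$ is the time since closure.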
While the algorithm of BCM is too complicated for large-scale parallel distributed processing, it has been put to use in lateral networks with some success.[11] Furthermore, some existing computational network learning algorithms have been made to correspond to BCM learning.[12]