Weight dependent synaptic plasticity rules
Mark van Rossum
Institute for Adaptive and Neural Computation, University of Edinburgh, UK
Acknowledgements
Guy Billings
Adam Barrett
Maria Shippi
Cian O'Donnell
Hebbian long term plasticity
LTP
[Bliss & Lomo '73]
LTD
[O'Connor & Wang '05]
Pairing high pre- and postsynaptic activity => long term potentiation
Pairing with low activity => long term depression
Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Anterograde alteration: preventing synaptic plasticity → anterograde amnesia. Yes (NMDA block)
AP5 blocks learning
[Morris et al '86]
Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Anterograde alteration: preventing synaptic plasticity → anterograde amnesia. Yes (NMDA block)
Detectability: changes in behaviour and synaptic efficacy should be correlated. Yes (Whitlock et al.)
Synaptic plasticity = memory?
[Whitlock, Heynen, Shuler, Bear '06]
Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Anterograde alteration: preventing synaptic plasticity → anterograde amnesia. Yes (NMDA block)
Detectability: changes in behaviour and synaptic efficacy should be correlated. Yes (Whitlock et al.)
Retrograde alteration: altering synaptic efficacies → retrograde amnesia. Yes (PKMζ), but...
Late LTP maintenance as an active process
ZIP disrupts a one-month-old memory
[Pastalkova et al. '06]
Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Anterograde alteration: preventing synaptic plasticity → anterograde amnesia. Yes (NMDA block)
Detectability: changes in behaviour and synaptic efficacy should be correlated. Yes (Whitlock et al.)
Retrograde alteration: altering synaptic efficacies → retrograde amnesia. Yes (PKMζ), but...
Mimicry: changing synaptic efficacies → new 'apparent' memory. Not quite yet...
Computational modelling of synaptic plasticity
Ultimate goal: quantitative, accurate models in health and disease
Complicated rules. Plasticity depends on:
- pre- and postsynaptic activity
- reward, modulation, history, other synapses, homeostasis, ...
- the synaptic weight itself
Most models are oversimplified
Plasticity due to random patterns: random walk
Random, independent sequence of LTP and LTD
[Figure: weight vs. synapse index]
Synaptic weight divergence
[Figure: weight vs. time (steps)]
Diffusion of weights [Sejnowski '77]
Weights run away, so bounds on the weights are needed
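As an illustration (not from the slides), a minimal NumPy sketch of this random walk; the step size c and the hard bounds [0, w_max] are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n_syn, n_steps, c, w_max = 1000, 5000, 0.01, 1.0

w_free = np.full(n_syn, 0.5)   # unbounded random walk
w_hard = np.full(n_syn, 0.5)   # same walk with hard bounds
for _ in range(n_steps):
    step = c * rng.choice([-1.0, 1.0], size=n_syn)   # random LTP or LTD
    w_free += step
    w_hard = np.clip(w_hard + step, 0.0, w_max)      # hard bounds [0, w_max]

print("unbounded std:", w_free.std())   # grows like c * sqrt(n_steps)
print("bounded std:  ", w_hard.std())   # saturates; mass piles up at 0 and w_max
```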
Dealing with synaptic weight divergence
Some possible solutions:
- Hard bounds
- BCM (*)
- Normalization/homeostasis (*): $\sum_i w_i = 1$ or $\sum_i w_i^2 = 1$
- Non-linear STDP (*)
(*) Competitive
What does biology say? The outcome of the rules depends strongly on the chosen solution...
LTP/LTD is weight dependent
Long term potentiation
[Debanne '99]
[Montgomery '01]
Long term depression
[Debanne '96]
Weight dependent random walk
[Figure: weight vs. synapse index]
Weight dependent learning rules
[Figure: weight vs. time (steps); equilibrium distribution P(w)]
Weight dependent plasticity prevents runaway and leads to realistic weight distributions [MvR et al. '00]
Simple model
Long term potentiation
[Debanne '99]
Long term depression
[Debanne '96]
Simple description
Relative change: $\Delta W_- / W = -c_1$; $\Delta W_+ / W = c_2 / W$
Absolute change: $\Delta W_- = -c_1 W$; $\Delta W_+ = c_2$
Table of contents
Weight dependent STDP in single neurons and networks
Spine volume dynamics can implement weight dependence
Weight dependence increases information capacity
Spike Timing Dependent Plasticity: experimental data
[Bi & Poo 1998]
Modelling STDP
[Diagram: integrate-and-fire neuron driven by Poisson input trains through plastic synapses]
Integrate-and-fire neurons
[Lapicque 1907, Brunel & MvR 2007]
Modelling STDP
[Diagram: integrate-and-fire neuron driven by Poisson input trains through plastic synapses]
$\Delta w = A_+ e^{-(t_{post}-t_{pre})/\tau_+}$ (pre before post)
$\Delta w = -A_- e^{-(t_{pre}-t_{post})/\tau_-}$ (post before pre)
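A sketch of this pair-based window as a function; the parameter values are illustrative only, with $A_-$ slightly larger than $A_+$ as on a later slide:

```python
import numpy as np

def stdp_dw(t_pre, t_post, A_plus=0.005, A_minus=0.00525,
            tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for one pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:   # pre before post: potentiation
        return A_plus * np.exp(-dt / tau_plus)
    else:        # post before (or with) pre: depression
        return -A_minus * np.exp(dt / tau_minus)

print(stdp_dw(0.0, 10.0))   # potentiation, decays with the 10 ms delay
print(stdp_dw(10.0, 0.0))   # depression
```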
Modelling STDP
[Figure: Poisson input trains]
Fokker-Planck approach
$\Delta w = A_+ e^{-(t_{post}-t_{pre})/\tau_+}$ (pre before post); $\Delta w = -A_- e^{-(t_{pre}-t_{post})/\tau_-}$ (post before pre)
$$\frac{\partial P(w,t)}{\partial t} = -\frac{\partial}{\partial w}\left[A(w)\,P(w,t)\right] + \frac{1}{2}\frac{\partial^2}{\partial w^2}\left[D(w)\,P(w,t)\right]$$
with drift $A(w) = -p_d A_- + p_p A_+$ and diffusion $D(w)$
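For completeness (not stated on the slide): with reflecting boundaries, this one-dimensional Fokker-Planck equation has the standard zero-flux stationary solution

$$P_\infty(w) \propto \frac{1}{D(w)}\exp\!\left(\int^{w} \frac{2A(w')}{D(w')}\,dw'\right)$$

so the equilibrium weight distribution can be read off directly from the drift and diffusion.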
Modelling STDP
$p_p = p_d\,(1 + w/\Sigma w)$
Fokker-Planck approach
$$\frac{\partial P(w,t)}{\partial t} = -\frac{\partial}{\partial w}\left[A(w)\,P(w,t)\right] + \frac{1}{2}\frac{\partial^2}{\partial w^2}\left[D(w)\,P(w,t)\right]$$
with drift $A(w) = -p_d A_- + p_p A_+$, diffusion $D(w)$, and $A_- = (1+\epsilon)A_+$
Modelling STDP: correlated Poisson trains
Requires hard bounds on weights; competitive
[Song & Abbott '01]
However, STDP is weight dependent ('soft bounds')
Weight dependence leads to observed weight distribution
[Song et al '05] [MvR, Bi, Turrigiano '00]
Data on weight distribution
Note many confounding factors
[Barbour et al. '07]
Learning correlations
Similar to Oja's rule. Weakly competitive.
[MvR & Turrigiano '01]
Ongoing background activity leads to weight fluctuations
Weight dependence leads to volatile memories
Spontaneous activity leads to memory decay
Decay is exponential
Decay is much faster for weight dependent STDP
How weight dependence leads to quick forgetting
Weight dependence leads to volatile memories
[Billings & MvR '09]
Experimental data: erasure by spontaneous activity
V-clamp
Xenopus tectum [Zhou & Poo, '03]
Are memories in networks unstable?
Stability of receptive fields in networks
[Diagram: LGN → V1]
V1-like network:
- Integrate-and-fire neurons
- Variable lateral inhibition
- Sometimes plastic recurrent connections
nSTDP: spontaneous symmetry breaking [Song & Abbott '01]
Weight dependent plasticity requires inhibition for selectivity
Broad tuning underlies the receptive field
[Panels: nSTDP vs. wSTDP]
Input tuning in experiments (wSTDP)
[Jia and Konnerth 2010]
Stability of receptive fields
[Panels: receptive fields; population vectors]
Inhibition rescues network stability
[Billings & MvR 2009]
Experimental evidence for the effect of inhibition on stability
Ocular dominance plasticity regulated by GABA?
[Hensch '05]
Reduced inhibition in auditory plasticity
[Froemke et al. '07]
Table of contents
Weight dependent STDP in single neurons and networks
- The observed weight dependence leads to realistic weight distributions
- The receptive fields are much less stable, but lateral inhibition can rescue and modulate retention
Spine dynamics can implement weight dependence
Weight dependence increases information capacity
Table of contents
Weight dependent STDP in single neurons and networks
Spine dynamics can implement weight dependence
Weight dependence increases information capacity
Biophysical implementation
[Diagram: spine on a dendrite; LTP inserts AMPA receptors]
Simple model for weight dependence: biophysical saturation
Spine morphology is remarkably plastic
[Matsuzaki '04, glutamate uncaging]
Tight correlation between weight and spine volume
Three Ca-volume scenarios
[O'Donnell & MvR, submitted]
Three scenarios
Undercompensating synapses freeze large weights
Note: this contrasts with most soft-bound rules.
Large spines are more stable
[from Trachtenberg '02, Supp. Info]
Biophysical implementation
see also [Kalantzis & Shouval '09]
Relation to disease?
[Fiala et al. '02]
[Pan et al. '10]
Table of contents
Weight dependent STDP in single neurons and networks
Spine dynamics can affect plasticity rules
- Spine morphology likely under-compensates Ca influx
- Leads to weight dependent learning rules
- Leads to stabilization of large spines
Weight dependence increases information capacity
Weight dependent learning and information storage
Inputs: binary patterns $x = (x_1, x_2, x_3, \ldots, x_n)$, e.g. (1 0 1 ... 1), (0 1 1 ... 0), (0 0 1 ... 0)
Weights: $w_1, w_2, w_3, \ldots, w_n$
Output: $y = \sum_{a=1}^{n} w_a x_a$
- Binary patterns x
- Weights are bounded
- Ongoing learning, interrupted by recognition test (see the sketch below)
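A minimal sketch of this setup (the sizes are placeholders):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000                                    # number of synapses
w = rng.uniform(0.0, 1.0, size=n)           # bounded weights in [0, 1]
x = (rng.random(n) < 0.5).astype(float)     # one binary input pattern

y = w @ x                                   # output: y = sum_a w_a x_a
```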
Measuring memory storage capacity
Separate learned from novel patterns ('lures')
Response in test phase: [Figure: P(y) vs. the neuron's output y]
Characterize with the signal-to-noise ratio:
$$\mathrm{SNR} = \frac{2\left[\langle y_u\rangle - \langle y_l\rangle\right]^2}{\mathrm{Var}\,y_u + \mathrm{Var}\,y_l}$$
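A sketch of the SNR computation, taking $u$ as learned and $l$ as lure as the notation suggests; the sample outputs here are Gaussian stand-ins, not model output:

```python
import numpy as np

def snr(y_u, y_l):
    """SNR between outputs to learned (y_u) and novel/lure (y_l) patterns."""
    return 2.0 * (y_u.mean() - y_l.mean())**2 / (y_u.var() + y_l.var())

rng = np.random.default_rng(3)
y_u = rng.normal(1.1, 0.2, size=500)   # outputs to learned patterns (stand-in)
y_l = rng.normal(1.0, 0.2, size=500)   # outputs to lures (stand-in)
print(snr(y_u, y_l))
```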
Ongoing learning: new memories overwrite old ones
[Figure: SNR vs. age of the pattern]
Exponential-like decay (but in principle many time-scales)
Trade-off: memory strength vs decay
[Figure: SNR vs. age of the pattern]
What is better: high initial SNR, or slow decay? [Fusi and Abbott '07]
Using Shannon information to resolve the trade-off
How much information about the pattern is gained by inspecting the output?
[Diagram: test pattern (new/old) vs. response (new/old)]
$$I = \sum_{s,r} P(r|s)\,P(s)\,\log_2 \frac{P(r|s)}{P(r)}$$
Always correct ~ 1 bit; chance level ~ 0 bits [Barrett and MvR '08]
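A sketch computing this information for the binary old/new discrimination from hit and false-alarm rates (the function name and values are illustrative):

```python
import numpy as np

def info_bits(p_hit, p_fa, p_old=0.5):
    """Mutual information (bits) between stimulus (old/new) and binary response."""
    p_s = np.array([p_old, 1.0 - p_old])           # P(s) for s = old, new
    p_r_s = np.array([[p_hit, 1.0 - p_hit],        # P(r|s=old)
                      [p_fa, 1.0 - p_fa]])         # P(r|s=new)
    p_r = p_s @ p_r_s                              # P(r)
    joint = p_s[:, None] * p_r_s
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log2(p_r_s / p_r)
    return np.nansum(terms)                        # 0*log(0) terms drop out

print(info_bits(1.0, 0.0))   # always correct -> 1 bit
print(info_bits(0.5, 0.5))   # chance level  -> 0 bits
```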
Relation between SNR and information
[Figure: SNR and information vs. age of the pattern]
For independent patterns, the total information per synapse:
$$I_{syn} = \frac{1}{N_{syn}} \sum_t I(t)$$
Best to store many patterns with low SNR; but what about weight dependence?
Optimizing learning rules numerically
In general: $\Delta w_i = f(x_i, y, w_i)$
But patterns are binary:
$\Delta w_i^+ = f(x_i = 1, y = \text{const}, w_i)$
$\Delta w_i^- = f(x_i = 0, y = \text{const}, w_i)$
Modelling learning
- Discretize the array of possible weights (100 bins)
- Learning rule characterized by transition matrices $M^+$ (high input) and $M^-$ (low input) [Fusi and Amit '02] (sketched below)
$$M^+ = \begin{pmatrix} 0&0&0&0&0&0\\ 0&0&0&0&0&0\\ 1&0&0&0&0&0\\ 0&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&0&0&1&1&1 \end{pmatrix} \qquad M^- = \begin{pmatrix} 1&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&0&0&1&1&0\\ 0&0&0&0&0&1\\ 0&0&0&0&0&0\\ 0&0&0&0&0&0 \end{pmatrix}$$
Note: learning here is not stochastic.
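A sketch of the transition-matrix formalism on the 6-state example above; the matrices follow the reconstruction on this slide and should be treated as illustrative:

```python
import numpy as np

n = 6
M_plus = np.zeros((n, n))
M_minus = np.zeros((n, n))
for j in range(n):
    M_plus[min(j + 2, n - 1), j] = 1.0       # potentiation: up two states, saturating
for j, i in enumerate([0, 0, 1, 2, 2, 3]):   # weight dependent depression targets
    M_minus[i, j] = 1.0

p = np.zeros(n)
p[0] = 1.0                                   # all probability mass in the lowest state
p = M_plus @ p                               # one potentiation event
print(p)                                     # mass has moved up two states
```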
Modelling learning
$$M^- = \begin{pmatrix} 1&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&0&0&1&1&0\\ 0&0&0&0&0&1\\ 0&0&0&0&0&0\\ 0&0&0&0&0&0 \end{pmatrix}$$
[Further example $M^-$ matrices illustrating alternative depression rules]
Modelling learning
- Learn from the equilibrium weight distribution $w_\infty$, with $M w_\infty = w_\infty$
- Potentiation: $M^+ w_\infty$; depression: $M^- w_\infty$
- Expected update: $M = p M^+ + (1-p) M^-$
- Signal decay: $l(t) = M^t l(0)$ (see the sketch below)
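A numerical sketch of these steps, reusing the illustrative 6-state matrices and assuming p = 0.5:

```python
import numpy as np

n = 6
M_plus = np.zeros((n, n))
M_minus = np.zeros((n, n))
for j in range(n):
    M_plus[min(j + 2, n - 1), j] = 1.0
for j, i in enumerate([0, 0, 1, 2, 2, 3]):
    M_minus[i, j] = 1.0

M = 0.5 * M_plus + 0.5 * M_minus      # expected update M = p M+ + (1-p) M-

w_inf = np.full(n, 1.0 / n)           # power iteration for M w_inf = w_inf
for _ in range(1000):
    w_inf = M @ w_inf

l = M_plus @ w_inf - w_inf            # signal imprinted by one potentiation
for t in range(5):
    print(t, np.abs(l).sum())         # signal decays as l(t) = M^t l(0)
    l = M @ l
```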
Weight independent learning
$\Delta w_i^+ = f(x_i = 1, y = \text{const}, w_i)$; $\Delta w_i^- = f(x_i = 0, y = \text{const}, w_i)$
0th order: $\Delta w_i^+ = a_1$, $\Delta w_i^- = -a_1$
Weight independent learning
0th order: $\Delta w_i^+ = a_1$, $\Delta w_i^- = -a_1$
$I_{syn} = 0.047$ bit
The optimal learning rule balances LTD against LTP
Weight dependent learning increases capacity
1st order: $\Delta w_i^+ = a_1 + b_1 w_i$ (potentiate), $\Delta w_i^- = a_2 + b_2 w_i$ (depress)
[Figure: weight update vs. weight]
Weight dependent learning increases capacity
Weight independent: $I_{syn} = 0.047$ bit; weight dependent ($\Delta w_i^+ = a_1 + b_1 w_i$, $\Delta w_i^- = a_2 + b_2 w_i$): $I_{syn} = 0.052$ bit
Higher order does not further increase capacity (significantly)
Restricting to excitatory synapses
0th order: $I_{syn} = 0.022$ bit
Restricting to excitatory synapses
0th order: $I_{syn} = 0.022$ bit; 1st order: $I_{syn} = 0.025$ bit
Using excitatory-only synapses reduces capacity
The weight dependent rule is again better
Why does it matter that weights are excitatory?
$$\mathrm{SNR} = \frac{2\left[\langle y_u\rangle - \langle y_l\rangle\right]^2}{\mathrm{Var}\,y_u + \mathrm{Var}\,y_l}$$
Note: $\mathrm{var}(y) \propto \mathrm{var}(wx) = \mathrm{var}(x)\,\mathrm{var}(w) + \mathrm{var}(x)\,\langle w\rangle^2 + \mathrm{var}(w)\,\langle x\rangle^2$
So the SNR is better if $\langle w\rangle = 0$, since then $\mathrm{var}(wx) = \mathrm{var}(x)\,\mathrm{var}(w) + \mathrm{var}(w)\,\langle x\rangle^2$
Increasing capacity by implementing feed-forward inhibition
[Diagram: feed-forward inhibitory neuron with fixed weights]
$$y = \sum_i w_i x_i - w^{inh} \sum_i x_i = \sum_i (w_i - w^{inh})\, x_i$$
So $\langle w^{eff}\rangle = \langle w_i - w^{inh}\rangle$ can be made 0
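A sketch of the effective-weight construction (the sizes and the choice w_inh = mean(w) are placeholders): subtracting a fixed inhibitory weight recenters the effective weights around zero.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
w = rng.uniform(0.0, 1.0, size=n)         # excitatory-only weights
w_inh = w.mean()                          # fixed inhibitory weight (one choice)
x = (rng.random(n) < 0.5).astype(float)   # binary input pattern

y = w @ x - w_inh * x.sum()               # y = sum_i (w_i - w_inh) x_i
w_eff = w - w_inh
print(w_eff.mean())                       # ~0: effective weights are balanced
```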
Weight distribution at various levels of inhibition
[Figure: P(weight) vs. $w^{eff} = w - w^{inh}$ for no, small, medium, and full inhibition]
Synapses cluster around effective weight 'zero' (balance)
Data on weight distribution
[Barbour et al. '07]
Further improvement: sparse patterns
$$\mathrm{SNR} = \frac{2\left[\langle y_u\rangle - \langle y_l\rangle\right]^2}{\mathrm{Var}\,y_u + \mathrm{Var}\,y_l}$$
Note: with $\langle w\rangle = 0$, $\mathrm{var}(y) \propto \mathrm{var}(wx) = \mathrm{var}(x)\,\mathrm{var}(w) + \mathrm{var}(w)\,\langle x\rangle^2$
So the SNR is better if $\langle x\rangle = 0$: use sparse patterns
Pattern sparseness increases capacity
Sparse patterns further increase information capacity (little effect on the distributions)
Comparison with discrete synapses
Discrete synapses? [Petersen '98, O'Connor & Wang '05]
Few synapses: discrete synapses perform well [Barrett, MvR '08]
Decay: $I_{syn} \propto 1/n_{syn}$ as transitions are made stochastic [Fusi & Amit '02, Fusi & Abbott '07]
The equilibrium distribution for optimal learning depends on the number of states
[Figure: P(weight) vs. weight]
Table of contents
Weight dependent STDP in single neurons and networks
Spine dynamics can implement weight dependence
Weight dependence increases information capacity
- Small but significant increase
- Feedforward inhibition and sparseness help
- Might also hold in networks [Huang & Amit, in press]
Open questions
Why are large spines more stable, from a computational viewpoint?
Relation to long term stability mechanisms, e.g. protein synthesis, synaptic tagging?
How general are these findings?
Discussion: towards realistic models of synaptic plasticity
Synaptic plasticity is weight dependent:
- Realistic weight distributions
- Shorter memory time, but rescued by inhibition
- Improved storage capacity
Spine volume dynamics could underlie weight dependence