Weight dependent synaptic plasticity rules

Mark van Rossum
Institute for Adaptive and Neural Computation, University of Edinburgh, UK

1

Acknowledgements

Guy Billings

Adam Barrett
Maria Shippi
Cian O'Donnell

2

Hebbian long term plasticity LTP

[Bliss & Lomo '73]

LTD

[O'Connor & Wang '05]

Pairing high pre- and postsynaptic activity => long-term potentiation. Pairing with low activity => long-term depression. 3

Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Anterograde alteration: preventing synaptic plasticity → anterograde amnesia. Yes (NMDA block)



4

AP5 blocks learning

[Morris et al '86]

5

Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Detectability: changes in behaviour and synaptic efficacy should be correlated. Yes (Whitlock et al.)



6

Synaptic plasticity = memory?

[Whitlock,.. and Bear '06]

7

Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Retrograde alteration: altering synaptic efficacies → retrograde amnesia. Yes (PKMζ), but...



8

Late LTP maintenance as an active process

ZIP disrupts a one-month-old memory

[Pastalkova et al '06] 9

Synaptic plasticity = memory? [Martin, Grimwood, Morris '00]
Mimicry: changing synaptic efficacies → a new 'apparent' memory. Not quite yet...



10

Computational modelling of synaptic plasticity
Ultimate goal: quantitative, accurate models in health and disease.
Complicated rules. Plasticity depends on:
- pre- and postsynaptic activity,
- reward, modulation, history, other synapses, homoeostasis, ...
- the synaptic weight itself

Most models are oversimplified

11

Plasticity due to random patterns: random walk
Random, independent sequence of LTP and LTD
(figure: weight vs. synapse index) 12

Synaptic weight divergence
(figure: weight vs. time in steps)
Diffusion of weights (Sejnowski '77)
The weights run away, so bounds on the weights are needed. 13
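As a quick illustration of this divergence, here is a minimal Python sketch (not from the talk; the step size and counts are arbitrary assumptions). With purely additive, weight-independent LTP and LTD the weights perform an unbounded random walk, and their spread grows as the square root of the number of plasticity events:

```python
import numpy as np

rng = np.random.default_rng(0)
n_syn, n_steps, step = 1000, 10_000, 0.01   # illustrative numbers

w = np.zeros(n_syn)
for _ in range(n_steps):
    # each synapse is randomly potentiated or depressed by the same fixed amount
    w += step * rng.choice([-1.0, 1.0], size=n_syn)

print(f"std of weights after {n_steps} steps: {w.std():.2f}")
print(f"predicted std, step*sqrt(n_steps):    {step * np.sqrt(n_steps):.2f}")
```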

Dealing with synaptic weight divergence. Some possible solutions:

Hard bounds



BCM (*)



Normalization/homeostasis (*)



Non-linear STDP (*)

$\sum_i w_i = 1$ or $\sum_i w_i^2 = 1$

What does biology say? The outcome of the rules depends strongly on the chosen solution...

(*) Competitive 14

LTP/LTD is weight dependent Long term potentiation

[Debanne '99]

[Montgomery '01]

Long term depression

[Debanne '96]

15

Weight dependent random walk
(figure: weight vs. synapse index) 16

Weight dependent learning rules
(figure: weight vs. time in steps; equilibrium distribution P(w))
Weight dependent plasticity prevents run-away and leads to realistic weight distributions [MvR et al. '00] 17

Simple model Long term potentiation

[Debanne '99]

Long term depression

[Debanne '96]

Simple description
Relative change: $\Delta W^- / W = -c_1$ ; $\Delta W^+ / W = c_2 / W$
Absolute change: $\Delta W^- = -c_1 W$ ; $\Delta W^+ = c_2$ 18
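A minimal Python sketch of this rule (the values of c1 and c2 are assumptions for illustration): with multiplicative depression and additive potentiation the weights settle into a stationary, unimodal distribution without any hard bounds.

```python
import numpy as np

rng = np.random.default_rng(1)
c1, c2 = 0.05, 0.01            # assumed LTD/LTP magnitudes
n_syn, n_steps = 5000, 20_000

w = np.full(n_syn, 0.2)
for _ in range(n_steps):
    ltp = rng.random(n_syn) < 0.5                  # random, independent LTP/LTD events
    w = np.where(ltp, w + c2, w * (1.0 - c1))      # additive LTP, multiplicative LTD

print(f"mean {w.mean():.3f}, std {w.std():.3f}, max {w.max():.3f}")
print(f"drift balance point c2/c1 = {c2 / c1:.3f}")  # where additive LTP cancels multiplicative LTD
```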

Table of contents



Weight dependent STDP in single neurons and networks



Spine volume dynamics can implement weight dependence



Weight dependence increases information capacity

19

Spike Timing Dependent Plasticity Experimental data

[Bi & Poo 1998]

20

Modelling STDP

Integrate & fire

Poisson trains

Plastic

21

Integrate-and-fire neurons

[Lapicque 1907, Brunel & MvR 2007] 22
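For reference, a minimal leaky integrate-and-fire neuron in Python; the parameter values are generic textbook numbers, not the ones used in the talk.

```python
import numpy as np

dt, T = 0.1, 200.0                                        # time step and duration (ms)
tau_m, v_rest, v_th, v_reset = 20.0, -70.0, -54.0, -60.0  # membrane parameters (ms, mV)
R, I_ext = 10.0, 2.0                                      # resistance (MOhm), input current (nA)

v, spikes = v_rest, []
for step in range(int(T / dt)):
    # membrane equation: tau_m * dv/dt = -(v - v_rest) + R*I
    v += dt / tau_m * (-(v - v_rest) + R * I_ext)
    if v >= v_th:                                         # threshold crossing: spike and reset
        spikes.append(step * dt)
        v = v_reset

print(f"{len(spikes)} spikes in {T:.0f} ms")
```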

Modelling STDP

Integrate & fire

Poisson trains

Plastic synapses with an exponential STDP window:
$\Delta w = A_+ e^{-(t_{post} - t_{pre})/\tau_+}$ if $t_{post} > t_{pre}$ (LTP)
$\Delta w = -A_- e^{-(t_{pre} - t_{post})/\tau_-}$ if $t_{pre} > t_{post}$ (LTD) 23
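A sketch of this pair-based window in Python; the amplitudes and time constants are assumed values (with A− slightly larger than A+, as on the following slides).

```python
import numpy as np

A_plus, A_minus = 0.005, 0.0055     # assumed amplitudes
tau_plus, tau_minus = 20.0, 20.0    # assumed time constants (ms)

def stdp_dw(t_pre: float, t_post: float) -> float:
    """Weight change for one pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:                                    # pre before post -> potentiation
        return A_plus * np.exp(-dt / tau_plus)
    return -A_minus * np.exp(dt / tau_minus)      # post before pre -> depression

for dt in (-50, -10, 10, 50):
    print(f"t_post - t_pre = {dt:+4d} ms  ->  dw = {stdp_dw(0.0, dt):+.5f}")
```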

Modelling STDP

Poisson trains

24

Fokker-Planck approach
STDP window: $\Delta w = A_+ e^{-(t_{post}-t_{pre})/\tau_+}$ (LTP), $\Delta w = -A_- e^{-(t_{pre}-t_{post})/\tau_-}$ (LTD)
$\frac{\partial P(w,t)}{\partial t} = \underbrace{-\frac{\partial}{\partial w}\left[A(w) P(w,t)\right]}_{\text{drift}} + \underbrace{\frac{1}{2}\frac{\partial^2}{\partial w^2}\left[D\,P(w,t)\right]}_{\text{diffusion}}$
$A(w) = -p_d A_- + p_p A_+$ 25

Modelling STDP

$p_p = p_d (1 + w / \sum w)$

26

Fokker-Planck approach (as above), now with $A_- = (1+\epsilon) A_+$ 27

Modelling STDP: correlated Poisson trains
Requires hard bounds on the weights; competitive [Song & Abbott '01] 28

However, STDP is weight dependent ('soft bounds')

29

Weight dependence leads to observed weight distribution

[Song et al '05] [MvR, Bi, Turrigiano '00]

30

Data on weight distribution

Note many confounding factors

[Barbour et al. '07] 31

Learning correlations

Similar to Oja's rule. Weakly competitive.

[MvR & Turrigiano '01] 32

Ongoing background activity leads to weight fluctuations

33

Weight dependence leads to volatile memories

Spontaneous activity leads to memory decay



Decay is exponential



Decay is much faster for weight dependent STDP



34

How weight dependence leads to quick forgetting

35

Weight dependence leads to volatile memories

[Billings & MvR '09] 36

Experimental data: erasure by spontaneous activity

V-clamp

Xenopus tectum [Zhou & Poo, '03]

Are memories in networks unstable? 37

Stability of receptive fields in networks
V1-like network (LGN → V1):
- Integrate-and-fire neurons
- Variable lateral inhibition
- Sometimes plastic recurrent connections 38

nSTDP: Spontaneous symmetry breaking [Song & Abbott '01] 39

Weight dependent plasticity requires inhibition for selectivity

40

Broad tuning underlies receptive field nSTDP

wSTDP

41

Input tuning in experiments wSTDP

[Jia and Konnerth 2010] 42

Stability of receptive fields Receptive fields

Population vectors

43

Inhibition rescues network stability

[Billings & MvR 2009]

44

Experimental evidence for effect of inhibition on stability Ocular Dominance plasticity regulated by GABA?



[Hensch '05]

Reduced inhibition in auditory plasticity



[Froemke et al 07] 45

Table of contents



Weight dependent STDP in single neurons and networks
- The observed weight dependence leads to realistic weight distributions
- The receptive fields are much less stable, but lateral inhibition can rescue and modulate retention



Spine dynamics can implement weight dependence



Weight dependence increases information capacity

46

Table of contents



Weight dependent STDP in single neurons and networks



Spine dynamics can implement weight dependence



Weight dependence increases information capacity

47

Biophysical implementation

(diagram: spine on a dendrite, AMPA-R, LTP)

Simple model for weight dependence: biophysical saturation 48

Spine morphology is remarkably plastic

[Matsuzaki '04, glutamate uncaging] Tight correlation between synaptic weight and spine volume

49

Three Ca-volume scenarios

[O'Donnell & MvR, submitted]

50

Three scenarios

51

Undercompensating synapses freeze large weights
Note: this contrasts with most soft-bound rules. 52

Large spines are more stable

[from Trachtenberg '02 Supp Info] 53

Biophysical implementation

see also [Kalantzis & Shouval '09] 54

Relation to disease?

[Fiala et al. '02]

[Pan et al. '10]

55

Table of contents



Weight dependent STDP in single neurons and networks



Spine dynamics can affect plasticity rules
- Spine morphology likely under-compensates Ca influx
- Leads to weight dependent learning rules
- Leads to stabilization of large spines



Weight dependence increases information capacity

56

Weight dependent learning and information storage
(diagram: binary input patterns on inputs $x_1, x_2, \dots, x_n$ with weights $w_1, w_2, \dots, w_n$)
Output: $y = \sum_{a=1}^{n} w_a x_a$



Binary patterns x



Weights are bounded



Ongoing learning, interrupted by recognition test 57

Measuring memory storage capacity
Separate learned from novel patterns ('lures'). Response in the test phase: (figure: distributions P(y) of the neuron's output y)
Characterize with the signal-to-noise ratio:
$\mathrm{SNR} = \frac{2\left[\langle y_u \rangle - \langle y_l \rangle\right]^2}{\mathrm{Var}(y_u) + \mathrm{Var}(y_l)}$ 58
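The sketch below (Python, with an assumed 0th-order clipped learning rule and arbitrary parameters) mimics this protocol: store a stream of binary patterns in a bounded weight vector, then compare the responses to stored patterns and to lures using the SNR above.

```python
import numpy as np

rng = np.random.default_rng(3)
n_syn, n_patterns, a = 1000, 200, 0.05          # assumed sizes and learning step

w = np.full(n_syn, 0.5)
patterns = rng.integers(0, 2, size=(n_patterns, n_syn)).astype(float)

for x in patterns:                              # ongoing learning, oldest pattern first
    w = np.clip(w + a * np.where(x == 1, 1.0, -1.0), 0.0, 1.0)

def snr(y_stored, y_lure):
    """2 * (difference of means)^2 / (sum of variances)."""
    return 2 * (y_stored.mean() - y_lure.mean()) ** 2 / (y_stored.var() + y_lure.var())

lures = rng.integers(0, 2, size=(n_patterns, n_syn)).astype(float)
y_lure = lures @ w
print(f"SNR, newest 20 patterns: {snr(patterns[-20:] @ w, y_lure):.3f}")
print(f"SNR, oldest 20 patterns: {snr(patterns[:20] @ w, y_lure):.3f}")
```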

Ongoing learning: new memories overwrite old ones

(figure: SNR vs. age of the pattern) Exponential-like decay (but in principle many time-scales) 59

Trade-off: memory strength vs decay

(figure: SNR vs. age of the pattern)
What is better: a high initial SNR, or slow decay? [Fusi and Abbott '07] 60

Using Shannon information to resolve the trade-off
How much information about the pattern is gained by inspecting the output?
(table: test pattern old/new vs. response old/new)
$I = \sum_{s,r} P(r|s) P(s) \log_2 \frac{P(r|s)}{P(r)}$
Always correct ~ 1 bit; chance level ~ 0 bits [Barrett and MvR '08]

61
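A small Python sketch of this measure, treating each test as a binary old/new channel; the accuracy values fed in are made-up numbers purely for illustration.

```python
import numpy as np

def recognition_info(p_correct_old: float, p_correct_new: float, p_old: float = 0.5) -> float:
    """I = sum_{s,r} P(s) P(r|s) log2[P(r|s)/P(r)] for a binary old/new test (bits)."""
    p_s = np.array([p_old, 1.0 - p_old])                            # P(s): old, new
    p_r_given_s = np.array([[p_correct_old, 1.0 - p_correct_old],
                            [1.0 - p_correct_new, p_correct_new]])  # rows: s, cols: r
    p_r = p_s @ p_r_given_s                                         # P(r)
    info = 0.0
    for s in range(2):
        for r in range(2):
            if p_r_given_s[s, r] > 0:
                info += p_s[s] * p_r_given_s[s, r] * np.log2(p_r_given_s[s, r] / p_r[r])
    return info

print(f"always correct: {recognition_info(1.0, 1.0):.3f} bit")   # ~1 bit
print(f"chance level  : {recognition_info(0.5, 0.5):.3f} bit")   # ~0 bit
print(f"75% correct   : {recognition_info(0.75, 0.75):.3f} bit")
```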

Relation between SNR and information
(figure: SNR and information vs. age of the pattern)
Independent patterns; total information per synapse: $I_{syn} = \frac{1}{N_{syn}} \sum_t I(t)$
Best to store many patterns with low SNR, but what about weight dependence? 62

Optimizing learning rules numerically

In general: $\Delta w_i = f(x_i, y, w_i)$
But patterns are binary:
$\Delta w_i^+ = f(x_i = 1, y = \mathrm{const}, w_i)$
$\Delta w_i^- = f(x_i = 0, y = \mathrm{const}, w_i)$ 63

Modelling learning

Discretize array of possible weights (100 bins)



Learning rule characterized by transition matrices $M^+$ (high input) and $M^-$ (low input) [Fusi and Amit '02].

(example matrices $M^+$ and $M^-$, each moving the weight up or down between the discrete states, not reproduced)

Note: learning is not stochastic. 64

Modelling learning
(three alternative example depression matrices $M^-$, not reproduced) 65

Modelling learning
Learn from the equilibrium weight distribution $\rho_\infty$, defined by $M \rho_\infty = \rho_\infty$
Potentiation: $M^+ \rho_\infty$
Depression: $M^- \rho_\infty$
Expected update: $M = p M^+ + (1-p) M^-$
Signal decay: $\rho_l(t) = M^t \rho_l(0)$ 66
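The sketch below illustrates this formalism in Python. The particular M+ and M− used here are simple one-step shift matrices over ten states, an assumption for illustration rather than the matrices from the slides.

```python
import numpy as np

n_states, p = 10, 0.5                     # assumed number of weight states and P(input bit = 1)

# M_plus moves the weight up one state (saturating at the top); M_minus moves it down.
M_plus = np.zeros((n_states, n_states))
M_minus = np.zeros((n_states, n_states))
for i in range(n_states):
    M_plus[min(i + 1, n_states - 1), i] = 1.0
    M_minus[max(i - 1, 0), i] = 1.0

M = p * M_plus + (1 - p) * M_minus        # expected update per stored pattern

# Equilibrium distribution rho_inf: eigenvector of M with eigenvalue 1.
vals, vecs = np.linalg.eig(M)
rho_inf = np.real(vecs[:, np.argmax(np.real(vals))])
rho_inf /= rho_inf.sum()

# Signal decay: a just-potentiated population relaxes back as rho(t) = M^t rho(0).
rho, states = M_plus @ rho_inf, np.arange(n_states)
trace = [rho @ states]
for _ in range(30):
    rho = M @ rho
    trace.append(rho @ states)
print("mean weight of potentiated synapses over time:")
print(np.round(trace, 3))
```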

Weight independent learning
$\Delta w_i^+ = f(x_i = 1, y = \mathrm{const}, w_i)$, $\Delta w_i^- = f(x_i = 0, y = \mathrm{const}, w_i)$
0th order: $\Delta w_i^+ = a_1$, $\Delta w_i^- = -a_1$ 67

Weight independent learning (0th order): $I_{syn} = 0.047$ bit
The optimal learning rule balances LTD against LTP 68

Weight dependent learning increases capacity
1st order: $\Delta w_i^+ = a_1 + b_1 w_i$, $\Delta w_i^- = a_2 + b_2 w_i$
(figure: optimal potentiation and depression as a function of the weight) 69
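A quick sketch of such a first-order rule in Python, with assumed coefficients chosen so that potentiation shrinks and depression grows with the weight (as in the experimental data earlier in the talk); the weights equilibrate where the two balance.

```python
import numpy as np

rng = np.random.default_rng(5)
a1, b1 = 0.10, -0.10        # assumed potentiation: dw+ = a1 + b1*w
a2, b2 = -0.02, -0.05       # assumed depression:   dw- = a2 + b2*w

w = rng.uniform(0.0, 1.0, 2000)
for _ in range(5000):
    x = rng.integers(0, 2, w.size)                        # binary input bit per synapse
    dw = np.where(x == 1, a1 + b1 * w, a2 + b2 * w)
    w = np.clip(w + dw, 0.0, 1.0)

print(f"equilibrium mean {w.mean():.3f}, std {w.std():.3f}")
```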

Weight dependent learning increases capacity
0th order: $I_{syn} = 0.047$ bit; 1st order (weight dependent): $I_{syn} = 0.052$ bit



Higher order does not further increase capacity (significantly)



70

Restricting to excitatory synapses
0th order: $I_{syn} = 0.022$ bit

71

Restricting to excitatory synapses

$I_{syn} = 0.022$ bit (weight independent) vs. $I_{syn} = 0.025$ bit (weight dependent)

Using excitatory-only synapses reduces capacity



Weight dependent rule is again better



72

Why does it matter that weights are excitatory?

$\mathrm{SNR} = \frac{2\left[\langle y_u \rangle - \langle y_l \rangle\right]^2}{\mathrm{Var}(y_u) + \mathrm{Var}(y_l)}$
Note $\mathrm{var}(y) \propto \mathrm{var}(wx) = \mathrm{var}(x)\mathrm{var}(w) + \mathrm{var}(x)\langle w \rangle^2 + \mathrm{var}(w)\langle x \rangle^2$
So the SNR is better if $\langle w \rangle = 0$, which leaves $\mathrm{var}(wx) = \mathrm{var}(x)\mathrm{var}(w) + \mathrm{var}(w)\langle x \rangle^2$ 73

Increasing capacity by implementing feed-forward inhibition

(diagram: feed-forward inhibitory neuron with fixed weights)
$y = \sum_i w_i x_i - w_{inh} \sum_i x_i = \sum_i (w_i - w_{inh}) x_i$
So $\langle w_{eff} \rangle = \langle w_i - w_{inh} \rangle$ can be made zero 74
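A numeric sketch (assumed weights and patterns) of why this helps: subtracting the mean weight removes the ⟨w⟩² contribution to the output variance while leaving the mean difference between stored and lure responses untouched, so the SNR goes up.

```python
import numpy as np

rng = np.random.default_rng(4)
n_syn, n_trials = 1000, 5000

w = rng.uniform(0.0, 1.0, n_syn)                     # excitatory-only weights, <w> > 0
x = rng.integers(0, 2, size=(n_trials, n_syn)).astype(float)

y_exc = x @ w                                        # output without inhibition
w_inh = w.mean()                                     # inhibitory weight matched to <w>
y_bal = x @ (w - w_inh)                              # output with feed-forward inhibition

# The variance drops because the <w>^2 term of var(wx) is removed; the signal
# (mean response difference between stored and lure patterns) is unaffected.
print(f"var(y), excitatory only : {y_exc.var():.1f}")
print(f"var(y), with inhibition : {y_bal.var():.1f}")
```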

Weight distribution at various levels of inhibition
(figure: P(weight) for no, small, medium, and full inhibition; $w_{eff} = w - w_{inh}$)
Synapses cluster around effective weight 'zero' (balance) 75

Data on weight distribution

[Barbour et al. '07] 76

Further improvement: sparse patterns

$\mathrm{SNR} = \frac{\left[\langle y_u \rangle - \langle y_l \rangle\right]^2}{\frac{1}{2}\left[\mathrm{Var}(y_u) + \mathrm{Var}(y_l)\right]}$
Note $\mathrm{var}(y) \propto \mathrm{var}(wx) = \mathrm{var}(x)\mathrm{var}(w) + \mathrm{var}(w)\langle x \rangle^2$ (using $\langle w \rangle = 0$ from above)
So the SNR is better if $\langle x \rangle = 0$: use sparse patterns 77

Pattern sparseness increases capacity

Sparse patterns further increase information capacity (Little effect on distributions) 78

Comparison with discrete synapses

Discrete synapses? [Petersen '98, O'Connor & Wang '05]



With few synapses, discrete synapses perform well [Barrett, MvR '08]



Decay: $I_{syn} \propto 1/\sqrt{n_{syn}}$ as transitions are made stochastic [Fusi & Amit '02, Fusi & Abbott '07]

79

Equilibrium distribution for optimal learning depends on the number of states
(figure: P(weight) vs. weight) 80

Table of contents



Weight dependent STDP in single neurons and networks



Spine dynamics can implement weight dependence



Weight dependence increases information capacity
- Small but significant increase
- Feedforward inhibition and sparseness help
- Might also hold in networks [Huang & Amit, in press]

81

Open questions







- Why are large spines more stable, from a computational viewpoint?
- Relation to long-term stability mechanisms, e.g. protein synthesis, synaptic tagging?
- How general are these findings?

82

Discussion: towards realistic models of synaptic plasticity



Synaptic plasticity is weight dependent:
- Realistic weight distribution
- Shorter memory time, but rescued by inhibition
- Improved storage capacity



Spine volume dynamics could underlie weight dependence

83