Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule

Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is influenced by an environmental signal, termed a reward, which directs the changes in appropriate directions. We apply a recently introduced policy learning algorithm from machine learning to networks of spiking neurons, and derive a spike-time-dependent plasticity rule which ensures convergence to a local optimum of the expected average reward. The approach is applicable to a broad class of neuronal models, including the Hodgkin-Huxley model. We demonstrate the effectiveness of the derived rule in several toy problems. Finally, through statistical analysis we show that the synaptic plasticity rule established is closely related to the widely used BCM rule, for which good biological evidence exists.

1 Policy Learning and Neuronal Dynamics

Reinforcement Learning (RL) is a general term used for a class of lear...

Dorit Baras, Ron Meir

Dependent Plasticity Rule | Expected Average Reward | NECO 2007 | Plasticity Rule

Post Info

Added	27 Dec 2010
Updated	27 Dec 2010
Type	Journal
Year	2007
Where	NECO
Authors	Dorit Baras, Ron Meir
