Using fast weights to improve persistent contrastive divergence

The most commonly used learning algorithm for restricted Boltzmann machines is contrastive divergence, which starts a Markov chain at a data point and runs the chain for only a few iterations to get a cheap, low-variance estimate of the sufficient statistics under the model. Tieleman (2008) showed that better learning can be achieved by estimating the model’s statistics using a small set of persistent “fantasy particles” that are not reinitialized to data points after each weight update. With sufficiently small weight updates, the fantasy particles represent the equilibrium distribution accurately, but to explain why the method works with much larger weight updates it is necessary to consider the interaction between the weight updates and the Markov chain. We show that the weight updates force the Markov chain to mix fast, and using this insight we develop an even faster mixing chain that uses an auxiliary set of “fast weights” to implement a temporary overlay on the energy la...
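
The idea in the abstract can be illustrated with a small sketch. It assumes a tiny binary restricted Boltzmann machine without bias terms, in which persistent fantasy particles are advanced by one Gibbs step under the regular weights plus a fast-weight overlay. The function names (`fpcd_step`, `sample_hidden`, `sample_visible`), the learning rates, the decay factor of 0.95, and the toy data are all assumptions made for illustration, not details taken from the truncated abstract.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(v, W, rng):
    # W[i][j] connects visible unit i to hidden unit j (no biases: an assumption).
    probs = [sigmoid(sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(W[0]))]
    return [1 if rng.random() < p else 0 for p in probs], probs

def sample_visible(h, W, rng):
    probs = [sigmoid(sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(W))]
    return [1 if rng.random() < p else 0 for p in probs]

def fpcd_step(data, particles, W, W_fast, rng,
              lr=0.05, fast_lr=0.05, fast_decay=0.95):
    """One update with persistent particles and a fast-weight overlay.

    Hyperparameter values are illustrative assumptions.
    """
    n_vis, n_hid = len(W), len(W[0])

    # Positive statistics: data-driven, using the regular weights only.
    pos = [[0.0] * n_hid for _ in range(n_vis)]
    for v in data:
        _, ph = sample_hidden(v, W, rng)
        for i in range(n_vis):
            for j in range(n_hid):
                pos[i][j] += v[i] * ph[j] / len(data)

    # Negative statistics: advance the persistent particles by one Gibbs
    # step under the overlaid weights W + W_fast. The particles are NOT
    # reset to data points between updates.
    W_eff = [[W[i][j] + W_fast[i][j] for j in range(n_hid)]
             for i in range(n_vis)]
    neg = [[0.0] * n_hid for _ in range(n_vis)]
    new_particles = []
    for v in particles:
        h, _ = sample_hidden(v, W_eff, rng)
        v_new = sample_visible(h, W_eff, rng)
        _, ph = sample_hidden(v_new, W_eff, rng)
        for i in range(n_vis):
            for j in range(n_hid):
                neg[i][j] += v_new[i] * ph[j] / len(particles)
        new_particles.append(v_new)

    # Regular weights learn slowly; fast weights take the same gradient
    # but decay toward zero, so the overlay is only temporary.
    for i in range(n_vis):
        for j in range(n_hid):
            grad = pos[i][j] - neg[i][j]
            W[i][j] += lr * grad
            W_fast[i][j] = fast_decay * W_fast[i][j] + fast_lr * grad
    return new_particles

# Toy usage: two binary patterns, five persistent particles.
rng = random.Random(0)
n_vis, n_hid = 4, 3
W = [[0.0] * n_hid for _ in range(n_vis)]
W_fast = [[0.0] * n_hid for _ in range(n_vis)]
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
particles = [[rng.randint(0, 1) for _ in range(n_vis)] for _ in range(5)]
for _ in range(200):
    particles = fpcd_step(data, particles, W, W_fast, rng)
```

Because the fast weights decay at every step, their magnitude stays bounded (here by `fast_lr / (1 - fast_decay)`), so they can push the particles away from regions they currently occupy without permanently changing the learned model.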

Tijmen Tieleman, Geoffrey E. Hinton

Data Points | ICML 2009 | Machine Learning | Markov Chain | Weight Updates

Added	19 May 2010
Updated	19 May 2010
Type	Conference
Year	2009
Where	ICML
Authors	Tijmen Tieleman, Geoffrey E. Hinton
