Gradient-following learning methods can be difficult to implement in many applications, and stochastic variants are frequently used to overcome these difficulties. We derive quantitative learning curves for three online training methods used with a linear perceptron: direct gradient descent, node perturbation, and weight perturbation. The maximum learning rate for the stochastic methods scales inversely with the first power of the dimensionality of the noise injected into the system; with a sufficiently small learning rate, all three methods give identical learning curves. These results suggest guidelines for when these stochastic methods will be of limited utility, and considerations for architectures in which they will be effective.
Justin Werfel, Xiaohui Xie, H. Sebastian Seung
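As a rough illustration of the three update rules named in the abstract, the sketch below trains a linear perceptron on a random teacher using direct gradient descent, node perturbation, and weight perturbation. The problem sizes, perturbation amplitude, and per-method learning rates are hypothetical values chosen only to keep this demo stable (the stochastic methods are given smaller learning rates, in line with the abstract's scaling claim); the update forms are the standard formulations of these methods, not necessarily the paper's exact notation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes chosen for illustration (not from the paper).
n_in, n_out = 20, 5
W_teacher = rng.standard_normal((n_out, n_in))   # target linear map
sigma = 0.01                                     # perturbation amplitude

def loss(W, x, d):
    """Squared error of the linear perceptron y = W x on one example."""
    e = d - W @ x
    return 0.5 * float(e @ e)

def grad_step(W, x, d, eta=0.02):
    """Direct online gradient descent on the squared error."""
    e = d - W @ x
    return W + eta * np.outer(e, x)

def node_perturbation_step(W, x, d, eta=0.005):
    """Add noise to the output units and move the weights in proportion
    to the resulting change in error (noise dimension = n_out)."""
    E0 = loss(W, x, d)
    xi = sigma * rng.standard_normal(n_out)
    e = d - (W @ x + xi)
    E1 = 0.5 * float(e @ e)
    return W - (eta / sigma**2) * (E1 - E0) * np.outer(xi, x)

def weight_perturbation_step(W, x, d, eta=0.0005):
    """Add noise to every weight and move the weights along the perturbation,
    scaled by the change in error (noise dimension = n_in * n_out)."""
    E0 = loss(W, x, d)
    psi = sigma * rng.standard_normal(W.shape)
    E1 = loss(W + psi, x, d)
    return W - (eta / sigma**2) * (E1 - E0) * psi

# Online training: draw a random input, use the teacher's output as the target.
for step in (grad_step, node_perturbation_step, weight_perturbation_step):
    W = np.zeros_like(W_teacher)
    for _ in range(20000):
        x = rng.standard_normal(n_in)
        W = step(W, x, W_teacher @ x)
    print(f"{step.__name__:>26s}: mean squared weight error = "
          f"{np.mean((W - W_teacher) ** 2):.2e}")
```

In expectation, both perturbation rules follow the same gradient as the direct method, but the injected noise forces a smaller stable learning rate: node perturbation's noise lives in the output space, weight perturbation's in the full weight space, so the latter converges noticeably more slowly in this sketch.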