Sciweavers

JMLR
2012

Deep Learning Made Easier by Linear Transformations in Perceptrons

We transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. This transformation aims at separating the problems of learning the linear and nonlinear parts of the whole input-output mapping, which has many benefits. We study the theoretical properties of the transformations by noting that they make the Fisher information matrix closer to a diagonal matrix, and thus the standard gradient closer to the natural gradient. We experimentally confirm the usefulness of the transformations by noting that they make basic stochastic gradient learning competitive with state-of-the-art learning algorithms in speed, and that they also seem to help find solutions that generalize better. The experiments include both classification of small images and learning a low-dimensional representation for images by using a deep unsupervised auto-encoder network. The...
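The "zero output and zero slope on average" condition in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes a tanh nonlinearity and adds a linear term and a constant, with the hypothetical names `alpha` and `beta`, chosen so that the transformed unit has zero mean output and zero mean derivative over a sample of pre-activations. The shortcut connections that would carry the removed linear part are not modelled here.

```python
import math
import random


def fit_alpha_beta(pre_activations):
    """Pick alpha and beta so that tanh(z) + alpha*z + beta has
    zero mean output and zero mean slope over the given sample."""
    n = len(pre_activations)
    # The slope of tanh(z) + alpha*z + beta is (1 - tanh(z)^2) + alpha,
    # so alpha must cancel the average tanh slope.
    alpha = -sum(1.0 - math.tanh(z) ** 2 for z in pre_activations) / n
    # beta then cancels whatever mean output remains.
    beta = -sum(math.tanh(z) + alpha * z for z in pre_activations) / n
    return alpha, beta


def transformed(z, alpha, beta):
    """Transformed hidden-unit output (assumed form, for illustration)."""
    return math.tanh(z) + alpha * z + beta


def transformed_slope(z, alpha):
    """Derivative of the transformed output with respect to z."""
    return (1.0 - math.tanh(z) ** 2) + alpha


# Illustrative pre-activations; the distribution is an arbitrary assumption.
random.seed(0)
zs = [random.gauss(0.5, 1.0) for _ in range(1000)]

alpha, beta = fit_alpha_beta(zs)
mean_output = sum(transformed(z, alpha, beta) for z in zs) / len(zs)
mean_slope = sum(transformed_slope(z, alpha) for z in zs) / len(zs)
```

Both `mean_output` and `mean_slope` come out at zero up to floating-point error on the sample used to fit `alpha` and `beta`. In training, these quantities would presumably be re-estimated as the weights change.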
Tapani Raiko, Harri Valpola, Yann LeCun
Added 27 Sep 2012
Updated 27 Sep 2012
Type Journal
Year 2012
Where JMLR
Authors Tapani Raiko, Harri Valpola, Yann LeCun