The output weight optimization-hidden weight optimization (OWO-HWO) algorithm for training the multilayer perceptron alternately updates the output weights and the hidden weights. This layer-by-layer training strategy greatly improves convergence speed. However, in HWO the desired net functions are moved only along the gradient direction, which limits efficiency. In this paper, two improvements to the OWO-HWO algorithm are presented. First, new desired net functions are proposed for hidden layer training that use Hessian matrix information rather than gradients. Second, a weighted hidden layer error function, which takes saturation into consideration, is derived directly from the global error function. Both techniques greatly increase training speed. Faster convergence is verified by simulations with remote sensing data sets.
Changhua Yu, Michael T. Manry, Jiang Li
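
For orientation, the alternating scheme summarized in the abstract can be sketched as follows. This is a minimal illustration of baseline, gradient-based OWO-HWO as it is commonly described, not the Hessian-based desired net functions or the weighted hidden layer error function proposed in the paper. The function names, the bypass (input-to-output) weights, the sigmoid activation, the step-halving rule, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def owo(Xa, W_hid, T):
    """OWO step: with hidden weights fixed, fit the linear output weights
    by least squares. Returns weights, residuals, hidden outputs, and MSE."""
    O = sigmoid(Xa @ W_hid)              # hidden unit outputs
    B = np.hstack([Xa, O])               # basis: augmented inputs + hidden outputs
    W_out, *_ = np.linalg.lstsq(B, T, rcond=None)
    err = T - B @ W_out
    return W_out, err, O, float(np.mean(err ** 2))

def train_owo_hwo(X, T, n_hidden=5, n_iter=20, z=1.0, seed=0):
    """Alternate OWO and a gradient-based HWO step (illustrative baseline)."""
    rng = np.random.default_rng(seed)
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias input
    W_hid = rng.normal(scale=0.5, size=(Xa.shape[1], n_hidden))
    W_out, err, O, mse = owo(Xa, W_hid, T)
    history = [mse]
    for _ in range(n_iter):
        # Delta for each hidden net function: back-propagated output error
        # times the sigmoid derivative (negative gradient direction).
        W_oh = W_out[Xa.shape[1]:]                  # hidden-to-output weights
        delta = (err @ W_oh.T) * O * (1.0 - O)
        # HWO step: the desired net function is net + z * delta, so solve
        # Xa @ D ~= delta in the least-squares sense for the weight change D.
        D, *_ = np.linalg.lstsq(Xa, delta, rcond=None)
        step = z
        for _ in range(10):                         # assumed step-halving safeguard
            cand = W_hid + step * D
            W_out_c, err_c, O_c, mse_c = owo(Xa, cand, T)
            if mse_c <= mse:                        # accept only non-increasing error
                W_hid, W_out, err, O, mse = cand, W_out_c, err_c, O_c, mse_c
                break
            step *= 0.5
        history.append(mse)
    return W_hid, W_out, history

# Synthetic demonstration data (hypothetical, not the paper's remote sensing sets).
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, size=(200, 2))
T_demo = np.sin(3.0 * X_demo[:, :1]) * X_demo[:, 1:]
W_hid_demo, W_out_demo, mse_history = train_owo_hwo(X_demo, T_demo)
print("initial MSE:", mse_history[0], "final MSE:", mse_history[-1])
```

Because a candidate hidden-weight update is accepted only when the error after the subsequent OWO solve does not increase, the recorded training error in this sketch is non-increasing by construction. The paper's contributions would replace the gradient-based `delta` and the unweighted least-squares fit for `D` in this loop.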