The well-known backpropagation (BP) derivative computation process for multilayer perceptrons (MLP) learning can be viewed as a simplified version of the Kelley-Bryson gradient formula in the classical discrete-time optimal control theory [1]. We detail the derivation in the spirit of dynamic programming, showing how they can serve to implement more elaborate learning whereby teacher signals can be presented to any nodes at any hidden layers, as well as at the terminal output layer. We illustrate such an elaborate training scheme using a small-scale industrial problem as a concrete example, in which some hidden nodes are taught to produce specified target values. In this context, part of the hidden layer is no longer “hidden.”
Eiji Mizutani, Stuart E. Dreyfus, Kenichi Nishio