This paper proposes three novel training methods for Multilayer Perceptron (MLP) binary classifiers: two based on the back-propagation approach and a third based on information theory. Both back-propagation methods are based on the Maximal Margin (MM) principle. The first, based on the gradient descent with adaptive learning rate algorithm (GDX) and therefore named Maximum-Margin GDX (MMGDX), directly increases the margin of the MLP output-layer hyperplane. The proposed method jointly optimizes both MLP layers in a single process, back-propagating the gradient of an MM-based objective function through the output and hidden layers in order to create a hidden-layer space that enables a larger margin for the output-layer hyperplane. This avoids the testing of many arbitrary kernels, as occurs in the case of SVM training. The proposed MM-based objective function aims to stretch out the margin to its limit. An objective function based on the Lp-norm is also proposed in order to take into ...
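To make the idea of back-propagating the gradient of an MM-based objective through both MLP layers concrete, the following is a minimal, hypothetical sketch, not the authors' exact MMGDX formulation: it uses a plain hinge penalty on the geometric margin y_i f(x_i) / ||w|| of the output-layer hyperplane (where w is the output-layer weight vector) and updates both layers by fixed-rate gradient descent rather than GDX. All names (X, y, n_hidden, lr, margin_target) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data with labels in {-1, +1} (illustrative only).
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)

n_hidden = 8
W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))  # hidden-layer weights
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.5, size=n_hidden)                 # output-layer weights
b2 = 0.0

lr = 0.05
margin_target = 1.0  # desired geometric margin (assumption)

for epoch in range(500):
    # Forward pass through both layers.
    H = np.tanh(X @ W1 + b1)              # hidden-layer activations
    f = H @ w2 + b2                       # output-layer hyperplane response
    w_norm = np.linalg.norm(w2) + 1e-12
    geo_margin = y * f / w_norm           # signed geometric margin per sample

    # Margin-style objective: penalize samples below the target margin.
    viol = np.maximum(0.0, margin_target - geo_margin)
    loss = viol.mean()

    # Back-propagate the objective's gradient through output and hidden layers.
    active = (viol > 0).astype(float)     # only margin violators contribute
    dL_df = -(active * y) / (w_norm * len(y))
    dL_dw2 = -(1.0 / len(y)) * (
        (active * y) @ H / w_norm
        - np.sum(active * y * f) * w2 / w_norm ** 3
    )
    dL_db2 = dL_df.sum()
    dL_dH = np.outer(dL_df, w2)
    dL_dZ = dL_dH * (1.0 - H ** 2)        # tanh derivative
    dL_dW1 = X.T @ dL_dZ
    dL_db1 = dL_dZ.sum(axis=0)

    # Joint update of both layers in a single process.
    W1 -= lr * dL_dW1
    b1 -= lr * dL_db1
    w2 -= lr * dL_dw2
    b2 -= lr * dL_db2

acc = np.mean(np.sign(np.tanh(X @ W1 + b1) @ w2 + b2) == y)
print(f"final margin loss {loss:.4f}, training accuracy {acc:.2%}")
```

The point of the sketch is the joint optimization: the hidden-layer weights W1 receive gradients that reshape the hidden-layer space so the output-layer hyperplane can attain a larger margin, which is the mechanism the abstract describes, rather than choosing among fixed kernels as in SVM training.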