An experimental study of two decision issues in wrapper feature selection (FS) with multilayer perceptrons and the sequential backward selection (SBS) procedure is presented. The decision issues studied are the stopping criterion and the retraining of the network before computing the saliency. Experimental results indicate that the increase in computational cost incurred by retraining the network with each feature temporarily removed before computing its saliency is rewarded with a significant performance improvement. Despite being quite intuitive, this idea has hardly been used in practice. A somewhat nonintuitive conclusion can be drawn from the stopping criterion results, suggesting that forcing overtraining may be as useful as early stopping. A significant improvement in the overall results with respect to learning with the whole set of variables is observed.
Enrique Romero, Josep M. Sopena
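The following is a minimal sketch, not the authors' implementation, of the wrapper SBS scheme with retraining that the abstract describes: each candidate feature's saliency is estimated by retraining the network with that feature temporarily removed and measuring validation accuracy. The use of scikit-learn's MLPClassifier, the network size, the synthetic data, and the simple stopping bookkeeping are assumptions made only for illustration.

```python
# Illustrative sketch of wrapper SBS with an MLP, with retraining before
# computing each feature's saliency. Library choice, network size, and data
# are assumptions; the paper does not prescribe them.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split


def validation_score(feature_subset, X_tr, y_tr, X_va, y_va):
    """Retrain an MLP on the given feature subset and return validation accuracy."""
    cols = sorted(feature_subset)
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=0)
    net.fit(X_tr[:, cols], y_tr)
    return net.score(X_va[:, cols], y_va)


def sbs_with_retraining(X_tr, y_tr, X_va, y_va, min_features=1):
    """Sequential backward selection: starting from all features, greedily drop
    the feature whose removal (after retraining) hurts validation accuracy least."""
    remaining = set(range(X_tr.shape[1]))
    best_subset = set(remaining)
    best_score = validation_score(remaining, X_tr, y_tr, X_va, y_va)
    while len(remaining) > min_features:
        # Saliency of feature f is taken as the score obtained after
        # retraining the network without f (higher score = lower saliency).
        scores = {f: validation_score(remaining - {f}, X_tr, y_tr, X_va, y_va)
                  for f in remaining}
        f_drop, score = max(scores.items(), key=lambda kv: kv[1])
        remaining = remaining - {f_drop}
        if score >= best_score:  # illustrative bookkeeping of the best subset seen
            best_score, best_subset = score, set(remaining)
    return sorted(best_subset), best_score


if __name__ == "__main__":
    # Synthetic example: only the first two features are informative.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 8))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
    print(sbs_with_retraining(X_tr, y_tr, X_va, y_va))
```

In this sketch the retraining step is the dominant cost, since a fresh network is fit for every candidate removal at every SBS step; this is exactly the extra computational expense that, according to the abstract, pays off in selection quality.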