Feature selection is a critical component of many pattern recognition applications. There are two distinct mechanisms for feature selection: wrapper methods and filter methods. Filter methods are generally considered inferior to wrapper methods, since wrapper methods directly incorporate the classifier to be used, and most filter methods do not directly address feature correlation. Wrapper methods are, however, computationally more demanding than filter methods. One popular method for wrapper-based feature selection is random mutation hill climbing, which performs a random search over the feature space to derive an optimal set of features. We describe two enhancements to this algorithm: one that improves its convergence time, and another that allows the results to be biased toward either higher accuracy or a lower feature count. We apply the algorithm to a real-world massive-scale feature selection problem involving the image classi...
Anil K. Jain, Michael E. Farmer, Shweta Bapna
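The baseline algorithm the abstract refers to, random mutation hill climbing over a binary feature mask, can be sketched as follows. This is a minimal illustrative version, not the authors' implementation: the `evaluate` callback stands in for the wrapper's classifier accuracy, and all names and parameters here are assumptions.

```python
import random

def rmhc_feature_select(evaluate, n_features, iters=1000, seed=0):
    """Random mutation hill climbing over a binary feature mask.

    evaluate(mask) -> fitness (e.g. wrapper classifier accuracy);
    higher is better. Illustrative sketch only.
    """
    rng = random.Random(seed)
    # Start from a random subset of the features.
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    best = evaluate(mask)
    for _ in range(iters):
        i = rng.randrange(n_features)   # flip one randomly chosen bit
        mask[i] = not mask[i]
        score = evaluate(mask)
        if score >= best:               # keep the mutation if it is no worse
            best = score
        else:                           # otherwise revert the flip
            mask[i] = not mask[i]
    return mask, best
```

Because each iteration evaluates the full classifier once, the wrapper cost grows with both the number of iterations and the expense of training the classifier, which is the computational burden the abstract alludes to.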