This paper proposes a methodology for feature selection in unsupervised learning. It uses a multiobjective genetic algorithm in which the number of features and a validity index measuring cluster quality are both minimized, guiding the search towards the most discriminant features and the best number of clusters. The proposed strategy is first evaluated on two synthetic data sets and then applied to handwritten month word recognition. Comprehensive experiments demonstrate the feasibility and efficiency of the proposed methodology.
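To make the two objectives concrete, the following is a minimal sketch of how a single candidate solution (a binary feature mask plus a number of clusters) could be scored and compared. It rests on several assumptions not stated in the abstract: the Davies-Bouldin index stands in for the unnamed validity index, plain k-means is used as the clustering step, and the function names (`kmeans`, `davies_bouldin`, `objectives`, `dominates`, `pareto_front`) are hypothetical. The genetic operators themselves are omitted.

```python
import math
import random


def kmeans(points, k, iters=25, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def davies_bouldin(points, labels, centers):
    """Davies-Bouldin index (assumed validity index); lower means better clusters."""
    k = len(centers)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        mean_dist = sum(math.dist(p, centers[c]) for p in members) / len(members) if members else 0.0
        scatter.append(mean_dist)
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if i == j:
                continue
            d = math.dist(centers[i], centers[j])
            if d == 0.0:
                return math.inf  # coincident centers: treat as a degenerate clustering
            worst = max(worst, (scatter[i] + scatter[j]) / d)
        total += worst
    return total / k


def objectives(data, mask, k):
    """Return (number of selected features, validity index); both are minimized."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected or k < 2 or k > len(data):
        return (len(selected), math.inf)  # infeasible candidate
    projected = [[row[i] for i in selected] for row in data]
    labels, centers = kmeans(projected, k)
    return (len(selected), davies_bouldin(projected, labels, centers))


def dominates(a, b):
    """True if score a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Keep only the scores that no other score dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]
```

In a multiobjective search, candidates on the Pareto front represent different trade-offs between fewer features and better cluster quality; choosing one of them is a separate decision that this sketch does not address.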
Marisa E. Morita, Robert Sabourin, Flávio B