Ensembles of Classifiers (EoC) have been shown to be effective in improving on the performance of single classifiers by combining their outputs. By training classifiers on diverse data subsets, ensemble creation methods can produce diverse classifiers for the EoC. In this work, we propose a scheme to measure data diversity directly from random subspaces, and we explore the possibility of using this data diversity to select the best data subsets for constructing the EoC. The applicability of the scheme is tested on NIST SD19 handwritten numerals.
Albert Hung-Ren Ko, Robert Sabourin, Luiz E. Soare