MCS
2007
Springer

Random Feature Subset Selection for Ensemble Based Classification of Data with Missing Features

Abstract. We report on our recent progress in developing an ensemble-of-classifiers-based algorithm for addressing the missing feature problem. Inspired in part by the random subspace method, and in part by an AdaBoost-type distribution update rule for creating a sequence of classifiers, the proposed algorithm generates an ensemble of classifiers, each trained on a different subset of the available features. An instance with missing features is then classified using only those classifiers whose training data did not include the currently missing features. Within this framework, we experiment with several bootstrap sampling strategies, each using a slightly different distribution update rule. We also analyze the effect of the algorithm’s primary free parameter (the number of features used to train each classifier) on its performance. We show that the algorithm is able to accommodate data with up to 30% missing features, with little or no significant performance drop.
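The core idea in the abstract (train each classifier on a random feature subset, then classify an instance using only the classifiers that never saw its missing features) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes uniform random subset selection in place of the AdaBoost-type distribution update rule, uses a simple nearest-centroid base classifier, combines votes by plain majority, and marks missing values with `None`. The names `train_centroids`, `build_ensemble`, and `predict` are hypothetical.

```python
import random
from collections import Counter

def train_centroids(X, y, feats):
    """Nearest-centroid model restricted to the feature indices in `feats`."""
    sums, counts = {}, Counter()
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
        counts[label] += 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def build_ensemble(X, y, n_classifiers, n_feats, seed=0):
    """Train each classifier on its own random subset of `n_feats` features.

    Assumption: subsets are drawn uniformly at random; the paper's
    distribution update rule is not modeled here.
    """
    rng = random.Random(seed)
    n_total = len(X[0])
    ensemble = []
    for _ in range(n_classifiers):
        feats = sorted(rng.sample(range(n_total), n_feats))
        ensemble.append((feats, train_centroids(X, y, feats)))
    return ensemble

def predict(ensemble, x):
    """Majority vote over classifiers whose features are all present in x.

    Missing features in x are marked with None. Returns None when no
    classifier is usable.
    """
    votes = Counter()
    for feats, centroids in ensemble:
        if any(x[f] is None for f in feats):
            continue  # this classifier was trained on a missing feature

        def dist(c):
            return sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c]))

        votes[min(centroids, key=dist)] += 1
    return votes.most_common(1)[0][0] if votes else None

# Toy data: two well-separated classes over four features.
X = [[0, 0, 0, 0], [1, 1, 1, 1], [10, 10, 10, 10], [11, 11, 11, 11]]
y = ["a", "a", "b", "b"]
ensemble = build_ensemble(X, y, n_classifiers=20, n_feats=2)
label = predict(ensemble, [0.5, None, 0.5, 0.5])  # feature 1 is missing
```

In this sketch the `n_feats` argument plays the role of the free parameter discussed in the abstract: smaller subsets leave more classifiers usable when features are missing, at the cost of each classifier seeing less of the data.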
Joseph DePasquale, Robi Polikar
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where MCS