
MCS
2009
Springer

Random Ordinality Ensembles: A Novel Ensemble Method for Multi-valued Categorical Data

Abstract. Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have little or no statistical support. In this paper, we propose a new ensemble method, Random Ordinality Ensembles (ROE), that circumvents this problem, and provides significantly improved accuracies over other popular ensemble methods. We perform a random projection of the categorical data into a continuous space by imposing random ordinality on categorical attribute values. A decision tree that learns on this new continuous space is able to use binary splits, hence avoiding the data fragmentation problem. A majority-vote ensemble is then constructed with several trees, each learnt from a different continuous space. An empirical evaluation on 13 datasets shows this simple method to significantly outperform standard techniques such as Boosting and Random Forests. Theoretical study using an information gain fram...
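The procedure described in the abstract (random ordering of category values, a tree learnt on the resulting numeric data using binary splits, then a majority vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (random_ordinality, fit_stump, fit_roe, predict_roe) are hypothetical, the base learner is a single-split decision stump standing in for a full decision tree, and every category seen at prediction time is assumed to have appeared in the training data.

```python
import random
from collections import Counter


def random_ordinality(values, rng):
    """Assign each distinct category a random rank 0..k-1 (a random ordering)."""
    cats = sorted(set(values))
    rng.shuffle(cats)
    return {cat: rank for rank, cat in enumerate(cats)}


def encode(rows, mappings):
    """Replace every categorical value by its rank under the given mappings."""
    return [[m[v] for m, v in zip(mappings, row)] for row in rows]


def fit_stump(X, y):
    """Stand-in base learner (assumption): the best single binary split x[j] <= t.

    A real decision tree would apply such splits recursively.
    """
    majority = Counter(y).most_common(1)[0][0]
    # Fallback when no valid split exists: predict the majority class everywhere.
    best = (sum(lab != majority for lab in y), 0, 0, majority, majority)
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [lab for x, lab in zip(X, y) if x[j] <= t]
            right = [lab for x, lab in zip(X, y) if x[j] > t]
            if not left or not right:
                continue
            left_lab = Counter(left).most_common(1)[0][0]
            right_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != left_lab for lab in left)
                      + sum(lab != right_lab for lab in right))
            if errors < best[0]:
                best = (errors, j, t, left_lab, right_lab)
    return best[1:]  # (feature index, threshold, left label, right label)


def fit_roe(rows, y, n_members=25, seed=0):
    """Train one base learner per random ordinality of the categorical data."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        mappings = [random_ordinality([row[j] for row in rows], rng)
                    for j in range(len(rows[0]))]
        members.append((mappings, fit_stump(encode(rows, mappings), y)))
    return members


def predict_roe(members, row):
    """Majority vote over all ensemble members."""
    votes = Counter()
    for mappings, (j, t, left_lab, right_lab) in members:
        x = encode([row], mappings)[0]
        votes[left_lab if x[j] <= t else right_lab] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data (invented for illustration): one attribute with five values.
    rows = [("red",), ("blue",), ("green",), ("black",), ("white",), ("red",)]
    labels = ["warm", "cool", "cool", "cool", "warm", "warm"]
    ensemble = fit_roe(rows, labels, n_members=11, seed=42)
    print(predict_roe(ensemble, ("green",)))
```

Each member sees the same rows under a different random ordering, so a binary threshold split groups the category values differently in each member; the vote then combines these groupings.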
Amir Ahmad, Gavin Brown
Added 26 Jul 2010
Updated 26 Jul 2010
Type Conference
Year 2009
Where MCS
Authors Amir Ahmad, Gavin Brown