A Comparison of Decision Tree Ensemble Creation Techniques


Download csmr.ca.sandia.gov

Abstract: We experimentally evaluate bagging and seven other randomization-based approaches to creating an ensemble of decision tree classifiers. Statistical tests were performed on experimental results from 57 publicly available data sets. When cross-validation comparisons were tested for statistical significance, the best method was statistically more accurate than bagging on only eight of the 57 data sets. Alternatively, examining the average ranks of the algorithms across the group of data sets, we find that boosting, random forests, and randomized trees are statistically significantly better than bagging. Because our results suggest that using an appropriate ensemble size is important, we introduce an algorithm that decides when a sufficient number of classifiers has been created for an ensemble. Our algorithm uses the out-of-bag error estimate, and is shown to result in an accurate ensemble for those methods that incorporate bagging into the construction of the ensemble.
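The abstract does not spell out how the stopping algorithm works, so the following is only a minimal sketch of one plausible way to pick an ensemble size from an out-of-bag accuracy curve. The function names (`moving_average`, `choose_ensemble_size`), the smoothing window of 5, and the block size of 20 are illustrative assumptions, not details confirmed by this page. The idea shown is to smooth the curve, scan it in fixed-size blocks, and stop once a block no longer improves on the best smoothed accuracy seen so far.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early points average over what is available."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def choose_ensemble_size(oob_accuracy: Sequence[float],
                         smooth_window: int = 5,   # assumed value
                         block: int = 20) -> int:  # assumed value
    """Suggest an ensemble size (1-based) from an out-of-bag accuracy curve.

    oob_accuracy[i] is the out-of-bag accuracy of the ensemble built from
    the first i + 1 classifiers.
    """
    if not oob_accuracy:
        raise ValueError("oob_accuracy must contain at least one value")

    smoothed = moving_average(oob_accuracy, smooth_window)

    best_so_far = float("-inf")
    best_block_start = 0
    for start in range(0, len(smoothed), block):
        block_max = max(smoothed[start:start + block])
        if block_max <= best_so_far:
            break  # this block did not improve on earlier blocks: stop
        best_so_far = block_max
        best_block_start = start

    # Within the last improving block, return the size whose raw
    # (unsmoothed) out-of-bag accuracy is highest.
    raw = list(oob_accuracy[best_block_start:best_block_start + block])
    return best_block_start + raw.index(max(raw)) + 1


# Hypothetical curve: accuracy rises, peaks at 26 classifiers, then drops.
curve = [0.5 + 0.01 * i for i in range(25)] + [0.9] + [0.7] * 74
suggested_size = choose_ensemble_size(curve)
```

In practice the curve would come from a bagged ensemble, where each training example is scored only by the trees whose bootstrap sample did not contain it. That is why a rule of this kind applies only to methods that incorporate bagging.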

Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer, W. Philip Kegelmeyer


Available Data Sets | Data Sets | Decision Tree Classifiers | PAMI 2007


Post Info

Added	27 Dec 2010
Updated	27 Dec 2010
Type	Journal
Year	2007
Where	PAMI
Authors	Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer, W. Philip Kegelmeyer
