Estimating class probabilities in random forests

For both single probability estimation trees (PETs) and ensembles of such trees, commonly employed class probability estimates correct the observed relative class frequencies in each leaf to avoid anomalies caused by small sample sizes. The effect of such corrections in random forests of PETs is investigated, and the use of the relative class frequency is compared to using two corrected estimates, the Laplace estimate and the m-estimate. An experiment with 34 datasets from the UCI repository shows that estimating class probabilities using relative class frequency clearly outperforms both using the Laplace estimate and the m-estimate with respect to accuracy, area under the ROC curve (AUC) and Brier score. Hence, in contrast to what is commonly employed for PETs and ensembles of PETs, these results strongly suggest that a non-corrected probability estimate should be used in random forests of PETs. The experiment further shows that learning random forests of PETs using relative class fr...
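The three estimates compared in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so the code assumes the standard textbook definitions: relative frequency n_c / n, the Laplace estimate (n_c + 1) / (n + k) for k classes, and the m-estimate (n_c + m * p_c) / (n + m) with class prior p_c. The function names, the default m = 2, and the averaging of per-tree estimates in forest_average are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def relative_frequency(counts: Sequence[int]) -> List[float]:
    """Uncorrected estimate: share of each class among the leaf's examples."""
    n = sum(counts)
    if n == 0:
        raise ValueError("leaf contains no training examples")
    return [c / n for c in counts]


def laplace_estimate(counts: Sequence[int]) -> List[float]:
    """Laplace correction: add one pseudo-example per class."""
    n, k = sum(counts), len(counts)
    return [(c + 1) / (n + k) for c in counts]


def m_estimate(counts: Sequence[int], priors: Sequence[float],
               m: float = 2.0) -> List[float]:
    """m-estimate: add m pseudo-examples distributed according to the priors.

    The default m = 2 is an arbitrary choice for this sketch.
    """
    n = sum(counts)
    return [(c + m * p) / (n + m) for c, p in zip(counts, priors)]


def forest_average(per_tree: Sequence[Sequence[float]]) -> List[float]:
    """Assumed combination rule: average the class estimates of all trees."""
    t = len(per_tree)
    return [sum(column) / t for column in zip(*per_tree)]


if __name__ == "__main__":
    leaf = [3, 0]  # a small, pure leaf: three examples of class 0, none of class 1
    print(relative_frequency(leaf))
    print(laplace_estimate(leaf))
    print(m_estimate(leaf, priors=[0.25, 0.75]))
```

For a small pure leaf such as [3, 0], the relative frequency assigns probability 1 to the majority class, while both corrected estimates pull the estimate toward a uniform or prior distribution. With uniform priors and m equal to the number of classes, the m-estimate coincides with the Laplace estimate.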

Henrik Boström

ICMLA 2007 | Machine Learning | Random Forests | Relative Class | Relative Class Frequency

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2007
Where	ICMLA
Authors	Henrik Boström
