When data collection is costly or time-consuming, an early prediction of classifier performance is extremely important for the design of the data mining process. The power law has previously been shown to be a good predictor of decision-tree error rates as a function of training set size. In this paper, we show that the optimal training set size for a given dataset can be computed from a learning curve characterized by a power law. Such a curve can be approximated from a small subset of the potentially available data and then used to estimate the expected trade-off between the error rate and the number of additional observations. The proposed approach to projected optimization of classifier utility is demonstrated and evaluated on several benchmark datasets.
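The idea above can be sketched in code. The following is an illustrative example, not the paper's actual procedure: it fits a power-law learning curve err(n) = a * n^(-b) to a few hypothetical (training size, error rate) measurements via log-log linear regression, then balances the fitted error rate against a hypothetical per-example data cost to pick a training set size. The data points, the cost parameter, and the additive cost model are all assumptions made for the sketch.

```python
import math

def fit_power_law(sizes, errors):
    """Fit err(n) = a * n**(-b) by least squares on log-log scale.

    Taking logs gives log(err) = log(a) - b*log(n), i.e. a straight
    line, so an ordinary linear regression recovers a and b.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (a, b)

def optimal_size(a, b, cost_per_example):
    """Minimize total cost a*n**(-b) + cost_per_example*n over n.

    Setting the derivative -a*b*n**(-b-1) + cost_per_example to zero
    yields the closed-form optimum below (assumed cost model).
    """
    return (a * b / cost_per_example) ** (1.0 / (1.0 + b))

# Hypothetical measurements on small training subsets (illustrative only).
sizes = [100, 200, 400, 800]
errors = [1.5 * n ** -0.25 for n in sizes]  # synthetic power-law data

a, b = fit_power_law(sizes, errors)
n_star = optimal_size(a, b, cost_per_example=1e-4)
print(f"fitted a={a:.3f}, b={b:.3f}, projected optimal size={n_star:.0f}")
```

The extrapolated curve `a * n**(-b)` can then be evaluated at candidate sizes larger than any subset actually trained on, which is what makes the early trade-off estimate possible.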