An Improved Model Selection Heuristic for AUC

Abstract. The area under the ROC curve (AUC) has been widely used to measure ranking performance for binary classification tasks. AUC only employs the classifier’s scores to rank the test instances; thus, it ignores other valuable information conveyed by the scores, such as sensitivity to small differences in the score values. However, as such differences are inevitable across samples, ignoring them may lead to overfitting the validation set when selecting models with high AUC. This problem is tackled in this paper. On the basis of ranks as well as scores, we introduce a new metric called scored AUC (sAUC), which is the area under the sROC curve. The latter measures how quickly AUC deteriorates if positive scores are decreased. We study the interpretation and statistical properties of sAUC. Experimental results on UCI data sets convincingly demonstrate the effectiveness of the new metric for classifier evaluation and selection in the case of limited validation data.
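To make the contrast between AUC and sAUC concrete, here is a minimal Python sketch based only on the description in the abstract. It assumes scores lie in [0, 1] and that the sROC curve plots the AUC obtained after lowering every positive score by a margin tau; under those assumptions the area under that curve reduces to the mean positive part of the pairwise score differences. The function names (auc, shifted_auc, scored_auc) and the toy scores are illustrative and not taken from the paper, whose exact definitions may differ.

```python
import numpy as np


def pairwise_diff(pos, neg):
    """Matrix of s_pos - s_neg for every (positive, negative) pair."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    return pos[:, None] - neg[None, :]


def auc(pos, neg):
    """Rank-based AUC: share of correctly ordered pairs, ties count 1/2."""
    d = pairwise_diff(pos, neg)
    return float(np.mean((d > 0) + 0.5 * (d == 0)))


def shifted_auc(pos, neg, tau):
    """AUC after lowering every positive score by tau (assumed sROC point)."""
    return float(np.mean(pairwise_diff(pos, neg) > tau))


def scored_auc(pos, neg):
    """Assumed sAUC: area under tau -> shifted_auc(tau) for tau in [0, 1].

    For scores in [0, 1] this integral equals mean(max(s_pos - s_neg, 0)).
    """
    return float(np.mean(np.clip(pairwise_diff(pos, neg), 0.0, None)))


# Two hypothetical models that rank the same validation set perfectly.
wide_pos, wide_neg = [0.9, 0.6], [0.5, 0.1]          # large score margins
narrow_pos, narrow_neg = [0.55, 0.52], [0.5, 0.48]   # tiny score margins

print(auc(wide_pos, wide_neg), scored_auc(wide_pos, wide_neg))
print(auc(narrow_pos, narrow_neg), scored_auc(narrow_pos, narrow_neg))
```

Both toy models have the same AUC, but the sAUC values differ by about a factor of ten (roughly 0.45 versus 0.045), because the second model separates the classes only by small score differences that could easily flip on a new sample.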

Shaomin Wu, Peter A. Flach, Cèsar Ferri Ramirez

Binary Classification Tasks | Classifier’s Scores | ECML 2007 | Machine Learning | ROC Curve

Post Info

Added	07 Jun 2010
Updated	07 Jun 2010
Type	Conference
Year	2007
Where	ECML
Authors	Shaomin Wu, Peter A. Flach, Cèsar Ferri Ramirez
