Improved variance estimation of classification performance via reduction of bias caused by small sample size

Background: Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust, with good generalization performance on new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval for the error rate using repeated design and test sets selected from the available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore, methods for small-sample performance estimation, such as a recently proposed procedure called Repeated Random Sampling (RSS), are also expected to yield heavily biased estimates, which in turn translate into biased confidence intervals. Here we explore such biases and develop a re...
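The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure (the abstract is truncated before it is described); it only assumes the ideal situation mentioned above, where every cycle uses a fresh design set and a fresh test set. The function name `simulate_variance_bias`, the Gaussian model for the true error rate, and all parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_variance_bias(n_designs=2000, n_test=10,
                           mean_err=0.2, sd_err=0.05, seed=0):
    """Compare the true between-design variance of the error rate with the
    variance of error rates observed on small, independent test sets.

    Assumption (illustrative only): the true error rate of each designed
    classifier is drawn from a Gaussian, clipped to [0, 1].
    """
    rng = random.Random(seed)
    true_errs = []
    observed_errs = []
    for _ in range(n_designs):
        # Hypothetical true error rate of the classifier from this design set.
        p = min(1.0, max(0.0, rng.gauss(mean_err, sd_err)))
        # Test on n_test novel samples: each is misclassified with probability p.
        n_wrong = sum(rng.random() < p for _ in range(n_test))
        true_errs.append(p)
        observed_errs.append(n_wrong / n_test)

    var_true = statistics.pvariance(true_errs)
    var_observed = statistics.pvariance(observed_errs)
    # Extra variance expected from binomial test sampling: E[p(1 - p)] / n_test.
    binomial_term = statistics.mean([p * (1 - p) for p in true_errs]) / n_test
    return var_true, var_observed, binomial_term


if __name__ == "__main__":
    for n_test in (10, 100, 1000):
        var_true, var_observed, binomial_term = simulate_variance_bias(n_test=n_test)
        print(f"n_test={n_test:5d}  true={var_true:.5f}  "
              f"observed={var_observed:.5f}  "
              f"true+binomial={var_true + binomial_term:.5f}")
```

Under these assumptions the observed variance is roughly the true between-design variance plus a binomial term that shrinks as the test set grows, so naive confidence intervals built from the observed spread would be too wide for small test sets.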

Ulrika Wickenberg-Bolin, Hanna Göransson, Mårten Fryknäs, Mats G. Gustafsson, Anders Isaksson

BMCBI 2006 | Confidence Intervals | Design | Variance Estimate |

» Multivariate gene selection: Does it help?

» Stable feature selection via dense feature groups

Post Info

Added	10 Dec 2010
Updated	10 Dec 2010
Type	Journal
Year	2006
Where	BMCBI
Authors	Ulrika Wickenberg-Bolin, Hanna Göransson, Mårten Fryknäs, Mats G. Gustafsson, Anders Isaksson

Sciweavers
