BMCBI
2006

Improved variance estimation of classification performance via reduction of bias caused by small sample size

Background: Supervised learning for cancer classification uses a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust and generalizes well to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval for the error rate using repeated design and test sets selected from the available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set leads to a large bias in the estimate of the true variance between design sets. Therefore, methods for small-sample performance estimation, such as a recently proposed procedure called Repeated Random Sampling (RSS), are also expected to produce heavily biased estimates, which in turn translate into biased confidence intervals. Here we explore such biases and develop a re...
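The repeated design/test procedure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the synthetic one-feature data, the nearest-centroid classifier, and all names and parameter values (`make_examples`, `design_classifier`, `repeated_random_splits`, `N_TEST`, `N_REPEATS`) are assumptions chosen for the example. It records the test error rate of each random split and computes the naive variance of those error rates, together with the average binomial sampling term e(1 - e)/n_test, which gives a rough idea of how much of the observed spread may come from the small test set alone rather than from differences between design sets.

```python
import random
import statistics

N_TEST = 10       # assumed small test set size
N_REPEATS = 200   # assumed number of random design/test splits


def make_examples(n_per_class, shift, rng):
    """Hypothetical synthetic data: one Gaussian feature, two classes."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(shift, 1.0), 1) for _ in range(n_per_class)]
    return data


def design_classifier(design):
    """Nearest-centroid rule: threshold midway between the two class means."""
    m0 = statistics.mean([x for x, y in design if y == 0])
    m1 = statistics.mean([x for x, y in design if y == 1])
    threshold = (m0 + m1) / 2.0
    upper = 1 if m1 >= m0 else 0  # class predicted above the threshold
    return lambda x: upper if x > threshold else 1 - upper


def repeated_random_splits(examples, n_test, n_repeats, rng):
    """Return the test error rate observed in each random design/test split."""
    errors = []
    for _ in range(n_repeats):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        test, design = shuffled[:n_test], shuffled[n_test:]
        predict = design_classifier(design)
        wrong = sum(predict(x) != y for x, y in test)
        errors.append(wrong / n_test)
    return errors


rng = random.Random(0)
examples = make_examples(n_per_class=20, shift=1.5, rng=rng)
errors = repeated_random_splits(examples, N_TEST, N_REPEATS, rng)

mean_error = statistics.mean(errors)
# Naive variance of the error rates across splits.
observed_variance = statistics.variance(errors)
# Average binomial sampling variance contributed by a test set of size N_TEST.
binomial_term = statistics.mean([e * (1 - e) for e in errors]) / N_TEST

print(f"mean error rate:        {mean_error:.3f}")
print(f"observed variance:      {observed_variance:.4f}")
print(f"binomial sampling term: {binomial_term:.4f}")
```

Because the splits are drawn from the same pool of examples, they are not independent, so comparing the two variance figures is only indicative of how strongly test set size can inflate the spread.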
Added 10 Dec 2010
Updated 10 Dec 2010
Type Journal
Year 2006
Where BMCBI
Authors Ulrika Wickenberg-Bolin, Hanna Göransson, Mårten Fryknäs, Mats G. Gustafsson, Anders Isaksson