BMCBI
2007

Stratification bias in low signal microarray studies

Background: When analysing microarray and other small sample size biological datasets, care is needed to avoid various biases. We analyse a form of bias, stratification bias, that can substantially affect analyses using sample-reuse validation techniques and lead to inaccurate results. This bias is due to imperfect stratification of samples in the training and test sets and the dependency between these stratification errors, i.e. the variations in class proportions in the training and test sets are negatively correlated.

Results: We show that when estimating the performance of classifiers on low signal datasets (i.e. those which are difficult to classify), which are typical of many prognostic microarray studies, commonly used performance measures can suffer from a substantial negative bias. For error rate this bias is only severe in quite restricted situations, but can be much larger and more frequent when using ranking measures such as the receiver operating characteristic (ROC) curv...
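
The negative correlation between training and test class proportions can be illustrated with a small sketch. This is not taken from the paper; it is a commonly used toy case, assuming leave-one-out validation, a perfectly balanced two-class dataset, and a hypothetical classifier that ignores the features and predicts the majority class of its training set (a stand-in for a classifier on data with no signal). The function name `loo_majority_accuracy` is an illustrative choice.

```python
def loo_majority_accuracy(labels):
    """Leave-one-out accuracy of a majority-class predictor.

    The predictor ignores features entirely, mimicking a classifier
    trained on data that carries no class signal (an assumption made
    for illustration only).
    """
    correct = 0
    for i, held_out in enumerate(labels):
        # Training set is every label except the held-out one.
        train = labels[:i] + labels[i + 1:]
        ones = sum(train)
        zeros = len(train) - ones
        # Predict the more frequent training class; ties go to class 0.
        prediction = 1 if ones > zeros else 0
        correct += int(prediction == held_out)
    return correct / len(labels)


# Balanced dataset: 10 samples of class 0 and 10 of class 1.
labels = [0] * 10 + [1] * 10
accuracy = loo_majority_accuracy(labels)
print(accuracy)
```

Removing one sample from a balanced set always leaves that sample's class in the minority of the training set, so the majority-class prediction is wrong on every fold. The estimated accuracy is therefore far below the 50% such a classifier would achieve on new balanced data, which is the kind of negative bias the abstract describes.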
Brian J. Parker, Simon Günter, Justin Bedo
Type Journal
Year 2007
Where BMCBI