Beating the Hold-Out: Bounds for K-fold and Progressive Cross-Validation

The empirical error on a test set, the hold-out estimate, is often a more reliable estimate of generalization error than the observed error on the training set, the training estimate. K-fold cross-validation is used in practice with the hope of being more accurate than the hold-out estimate without reducing the number of training examples. We argue that the k-fold estimate does in fact achieve this goal. Specifically, we show that for any nontrivial learning problem and any learning algorithm that is insensitive to example ordering, the k-fold estimate is strictly more accurate than a single hold-out estimate on 1/k of the data, for 2 ≤ k ≤ n (k = n is leave-one-out), based on its variance and all higher moments. Previous bounds were termed sanity-check because they compared the k-fold estimate to the training estimate and, further, restricted the VC dimension and required a notion of hypothesis stability [2]. In order to avoid these dependencies, we consider a k-fold hypothesis...
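The k-fold estimate the abstract describes is the average of k hold-out errors, each measured on a distinct 1/k fraction of the data after training on the remaining k-1 folds. A minimal Python sketch of that procedure, assuming a hypothetical learner object with scikit-learn-style fit and predict methods (not from the paper):

    import numpy as np

    def kfold_estimate(learner, X, y, k):
        # Average hold-out error over k disjoint folds: the k-fold estimate.
        n = len(y)
        folds = np.array_split(np.arange(n), k)
        errors = []
        for fold in folds:
            train = np.setdiff1d(np.arange(n), fold)
            learner.fit(X[train], y[train])    # train on the other k-1 folds
            preds = learner.predict(X[fold])   # test on the held-out 1/k of the data
            errors.append(np.mean(preds != y[fold]))
        return float(np.mean(errors))

Each entry of errors is a single hold-out estimate on 1/k of the data; the paper's claim is that their average is strictly more accurate than any one of them, in variance and all higher moments, provided the learner is insensitive to example ordering. With k = n the loop reduces to leave-one-out.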
Type Conference
Year 1999
Where COLT
Publisher Springer
Authors Avrim Blum, Adam Kalai, John Langford