Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms


In the context of binary classification, we define disagreement as a measure of how often two independently trained models differ in their classification of unlabeled data. We explore the use of disagreement for error estimation and model selection. We call the procedure co-validation, since the two models effectively (in)validate one another by comparing results on unlabeled data, which we assume is relatively cheap and plentiful compared to labeled data. We show that per-instance disagreement is an unbiased estimate of the variance of error for that instance. We also show that disagreement provides a lower bound on the prediction (generalization) error, and a tight upper bound on the "variance of prediction error", that is, the variance of the average error across instances, where variance is measured across training sets. We present experimental results on several data sets exploring co-validation for error estimation and model selection. The procedure is especially effective i...
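The disagreement measure described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy one-dimensional threshold learner (`train_threshold`), the synthetic Gaussian data (`make_labeled`), and the sample sizes. The only part that mirrors the abstract is `disagreement_rate`, which counts how often two models, trained on independently drawn labeled samples, assign different labels to the same unlabeled instances.

```python
import random


def disagreement_rate(preds_a, preds_b):
    """Fraction of unlabeled instances on which two models predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have equal length")
    if not preds_a:
        raise ValueError("need at least one unlabeled instance")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def train_threshold(sample):
    """Toy learner (hypothetical): threshold halfway between the two class means."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict(threshold, xs):
    """Label an instance 1 if it lies above the threshold, else 0."""
    return [1 if x > threshold else 0 for x in xs]


def make_labeled(rng, n):
    """Synthetic labeled data (assumption): class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    data = []
    for i in range(n):
        y = i % 2  # alternate labels so both classes are always present
        data.append((rng.gauss(2.0 * y, 1.0), y))
    return data


rng = random.Random(0)

# Two models trained on independently drawn labeled samples.
model_a = train_threshold(make_labeled(rng, 20))
model_b = train_threshold(make_labeled(rng, 20))

# Unlabeled pool: only the inputs are used, no labels are needed.
unlabeled = [rng.gauss(rng.choice([0.0, 2.0]), 1.0) for _ in range(1000)]

rate = disagreement_rate(predict(model_a, unlabeled), predict(model_b, unlabeled))
print(f"disagreement on unlabeled data: {rate:.3f}")
```

Note that no labels are consulted when computing `rate`; under the abstract's claims, such a quantity could serve as a lower bound on prediction error, but this sketch only computes the disagreement itself and does not verify any bound.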

Omid Madani, David M. Pennock, Gary William Flake

Error Estimation | NIPS 2004 | NIPS 2007 | Unlabeled Data | Variance |

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2004
Where	NIPS
Authors	Omid Madani, David M. Pennock, Gary William Flake
