NIPS
2008

Generative and Discriminative Learning with Unknown Labeling Bias

We apply robust Bayesian decision theory to improve both generative and discriminative learners under bias in class proportions in labeled training data, when the true class proportions are unknown. For the generative case, we derive an entropy-based weighting that maximizes expected log likelihood under the worst-case true class proportions. For the discriminative case, we derive a multinomial logistic model that minimizes worst-case conditional log loss. We apply our theory to the modeling of species geographic distributions from presence data, an extreme case of labeling bias since there is no absence data. On a benchmark dataset, we find that entropy-based weighting offers an improvement over constant estimates of class proportions, consistently reducing log loss on unbiased test data.
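The idea behind the generative case can be illustrated with a small sketch. It is not taken from the paper. It assumes the class-conditional distributions are known and discrete, and that the loss is the joint log loss of a model q(x, y) = w_y p_y(x). Under those assumptions the expected loss when the true class is y equals -log w_y + H(p_y), so the worst case over all true class proportions is the maximum of that quantity over classes, and it is minimized by weights proportional to exp(H(p_y)). The function names and the toy distributions below are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in nats of a discrete distribution given as a list."""
    return -sum(v * math.log(v) for v in p if v > 0)


def entropy_weights(class_conditionals):
    """Class weights proportional to exp(entropy) of each class-conditional.

    Assumption for this sketch: this is the minimax choice under joint
    log loss when the class-conditionals are known exactly.
    """
    hs = [entropy(p) for p in class_conditionals]
    m = max(hs)  # subtract the maximum for numerical stability
    ws = [math.exp(h - m) for h in hs]
    z = sum(ws)
    return [w / z for w in ws]


def worst_case_log_loss(weights, class_conditionals):
    """Worst expected joint log loss over all true class proportions.

    The expected loss is linear in the true proportions, so the maximum
    is attained when all mass sits on a single class.
    """
    return max(
        -math.log(w) + entropy(p)
        for w, p in zip(weights, class_conditionals)
    )


# Toy example: two classes over four feature values (hypothetical data).
p0 = [0.25, 0.25, 0.25, 0.25]  # entropy log(4)
p1 = [0.5, 0.5, 0.0, 0.0]      # entropy log(2)

robust = entropy_weights([p0, p1])
robust_loss = worst_case_log_loss(robust, [p0, p1])
uniform_loss = worst_case_log_loss([0.5, 0.5], [p0, p1])
```

In this toy case the entropy-based weights come out as 2/3 and 1/3, and the worst-case loss drops from log 8 with constant (uniform) weights to log 6, with both classes incurring the same loss.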
Miroslav Dudík, Steven J. Phillips
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where NIPS