
ICML
2004
IEEE

Improving SVM accuracy by training on auxiliary data sources

The standard model of supervised learning assumes that training and test data are drawn from the same underlying distribution. This paper explores an application in which a second, auxiliary source of data is available, drawn from a different distribution. The auxiliary data is more plentiful than the training and test data, but of significantly lower quality. In the SVM framework, a training example has two roles: (a) as a data point that constrains the learning process and (b) as a candidate support vector that can form part of the definition of the classifier. The paper considers using the auxiliary data in either or both of these roles. This auxiliary data framework is applied to the problem of classifying images of maple and oak leaves, using a kernel derived from the shapes of the leaves. Experiments show that when the training data set is very small, training with auxiliary data can produce large improvements in accuracy, even when the auxiliary data is significantly dif...
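One way to make the two roles concrete, offered as a sketch under stated assumptions and not as the paper's own formulation: write the classifier as a kernel expansion over a set S of candidate support vectors, and impose margin constraints on a separately chosen set T of examples. The symbols S, T, P, A, C_p, and C_a are illustrative and do not appear in the abstract; P stands for the primary training examples and A for the auxiliary ones. The objective shown is a 1-norm (linear programming) SVM variant, picked here only because it allows S and T to differ.

```latex
% Classifier: kernel expansion over candidate support vectors S (role b)
f(x) = \sum_{j \in S} \alpha_j \, y_j \, K(x_j, x) + b

% Training: margin constraints over T (role a), with separate slack
% penalties for primary (P) and auxiliary (A) examples (an assumption)
\min_{\alpha \ge 0,\; b,\; \xi \ge 0} \quad
  \sum_{j \in S} \alpha_j
  + C_p \sum_{i \in T \cap P} \xi_i
  + C_a \sum_{i \in T \cap A} \xi_i
\quad \text{subject to} \quad
  y_i \, f(x_i) \ge 1 - \xi_i \quad \forall i \in T
```

Under this sketch, S = T = P corresponds to training on the primary data alone. Adding A to T only would use the auxiliary data purely as constraints, adding A to S only would use it purely as candidate support vectors, and choosing C_a smaller than C_p would be one way to reflect the lower quality of the auxiliary data.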
Pengcheng Wu, Thomas G. Dietterich
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2004
Where ICML