ICDM
2003
IEEE

Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining

Predictive data mining typically relies on labeled data without exploiting a much larger amount of available unlabeled data. The goal of this paper is to show that using unlabeled data can be beneficial in a range of important prediction problems and therefore should be an integral part of the learning process. Given an unlabeled dataset representative of the underlying distribution and a K-class labeled sample that might be biased, our approach is to learn K contrast classifiers each trained to discriminate a certain class of labeled data from the unlabeled population. We illustrate that contrast classifiers can be useful in one-class classification, outlier detection, density estimation, and learning from biased data. The advantages of the proposed approach are demonstrated by an extensive evaluation on synthetic data followed by real-life bioinformatics applications for (1) ranking PubMed articles by their relevance to protein disorder and (2) cost-effective enlargement of a disord...
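The contrast-classifier idea in the abstract can be illustrated with a minimal sketch: for each of the K labeled classes, train one binary classifier that separates that class's labeled examples (target 1) from the unlabeled sample (target 0). The abstract does not say which base learner is used, so the one-dimensional logistic regression below is only a stand-in, and the function names (`train_logistic`, `train_contrast_classifiers`, `predict`) and the synthetic Gaussian data are hypothetical choices made for illustration.

```python
import math
import random


def train_logistic(xs, ys, epochs=200, lr=0.1):
    """Fit a 1-D logistic regression (weight, bias) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Return the classifier output in (0, 1) for a single point x."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train_contrast_classifiers(labeled, unlabeled):
    """Train one classifier per class: that class (1) versus unlabeled data (0).

    labeled   -- dict mapping class name to a list of feature values
    unlabeled -- list of feature values drawn from the overall population
    """
    models = {}
    for k, xs_k in labeled.items():
        xs = list(xs_k) + list(unlabeled)
        ys = [1.0] * len(xs_k) + [0.0] * len(unlabeled)
        models[k] = train_logistic(xs, ys)
    return models


# Synthetic toy data (an assumption, not from the paper): two classes centred
# at -2 and +2, and an unlabeled sample drawn from an equal mixture of both.
random.seed(0)
labeled = {
    "a": [random.gauss(-2.0, 1.0) for _ in range(200)],
    "b": [random.gauss(2.0, 1.0) for _ in range(200)],
}
unlabeled = [random.gauss(random.choice([-2.0, 2.0]), 1.0) for _ in range(400)]

models = train_contrast_classifiers(labeled, unlabeled)
score_a_left = predict(models["a"], -3.0)
score_a_right = predict(models["a"], 3.0)
print(score_a_left, score_a_right)
```

In this toy setting, the classifier for class "a" should score points near -3 higher than points near +3, since the latter resemble the part of the unlabeled population that class "a" does not cover. How such scores would feed into outlier detection or density estimation is described in the paper itself and is not reproduced here.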
Added 04 Jul 2010
Updated 04 Jul 2010
Type Conference
Year 2003
Where ICDM
Authors Kang Peng, Slobodan Vucetic, Bo Han, Hongbo Xie, Zoran Obradovic