Sciweavers

AMT
2006
Springer

Semi-Supervised Text Classification Using Positive and Unlabeled Data

Text classification using positive and unlabeled data refers to the problem of building a text classifier from positive documents (P) of one class and unlabeled documents (U) of many other classes. U contains both positive and negative documents. Several existing methods solve this PU-Learning problem by building a classifier in a two-step process. Generally speaking, these methods do not perform well when P is too small. In this paper, we propose an improved method aimed at solving the PU-Learning problem with a small P. The method combines graph-based semi-supervised learning with the two-step method. Experiments indicate that the improved method performs well when P is small.
Keywords: text classification, positive and unlabeled data, graph-based method
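The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a common form of the two-step scheme: step 1 picks "reliable negatives" from U as the documents least similar to P, and step 2 spreads the seed labels over a cosine-similarity graph with a simple label-propagation update. The function name `pu_label_propagation`, the parameters `n_neg`, `alpha` and `iters`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def pu_label_propagation(positive_docs, unlabeled_docs,
                         n_neg=2, alpha=0.8, iters=50):
    """Return one score per unlabeled document (> 0 leans positive).

    Illustrative sketch only; every parameter here is an assumption.
    """
    docs = [Counter(d.lower().split())
            for d in positive_docs + unlabeled_docs]
    n_pos, n = len(positive_docs), len(docs)

    # Dense similarity graph over all documents (no self-loops).
    W = [[cosine(docs[i], docs[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]

    # Step 1 (assumed heuristic): the unlabeled documents with the
    # lowest mean similarity to P are treated as reliable negatives.
    sim_to_p = {i: sum(W[i][j] for j in range(n_pos)) / n_pos
                for i in range(n_pos, n)}
    reliable_neg = sorted(sim_to_p, key=sim_to_p.get)[:n_neg]

    # Seed labels: +1 for P, -1 for reliable negatives, 0 otherwise.
    seed = [0.0] * n
    for i in range(n_pos):
        seed[i] = 1.0
    for i in reliable_neg:
        seed[i] = -1.0

    # Step 2: iterate score = alpha * neighbour average
    #                        + (1 - alpha) * seed label.
    scores = seed[:]
    for _ in range(iters):
        updated = []
        for i in range(n):
            row_sum = sum(W[i])
            spread = (sum(W[i][j] * scores[j] for j in range(n)) / row_sum
                      if row_sum else 0.0)
            updated.append(alpha * spread + (1 - alpha) * seed[i])
        scores = updated
    return scores[n_pos:]


# Toy usage with made-up documents: two sports documents as P.
positive = ["football match goal team", "team wins football league"]
unlabeled = ["goal scored in football match",
             "stock market shares fall",
             "bank shares and stock prices",
             "league team football season"]
scores = pu_label_propagation(positive, unlabeled)
predicted_positive = [d for d, s in zip(unlabeled, scores) if s > 0]
```

With only two positive documents, the graph lets the unlabeled sports documents pick up positive scores through their neighbours, which is the intuition behind using graph-based propagation when P is small.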
Shuang Yu, Xueyuan Zhou, Chunping Li
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where AMT
Authors Shuang Yu, Xueyuan Zhou, Chunping Li