CICLING
2004
Springer

Automatic Learning Features Using Bootstrapping for Text Categorization

When text categorization is applied to complex tasks, it is tedious and expensive to hand-label the large amounts of training data necessary for good performance. In this paper, we put forward an approach to text categorization that requires no labeled documents. The proposed approach automatically learns features using bootstrapping. The input consists of a small set of keywords per class and a large number of easily obtained unlabeled documents. Using these automatically learned features, we develop a naïve Bayes classifier. The classifier achieves 82.8% F1 when classifying a set of web documents into 10 categories, outperforming a supervised naïve Bayes classifier when the number of features is small.
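The abstract does not spell out how the bootstrapping step scores candidate features, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple co-occurrence margin for growing each class's feature set from its seed keywords, and it assumes documents are pseudo-labeled by the class whose features they contain most before a multinomial naïve Bayes model with Laplace smoothing is trained. The function names (`bootstrap_features`, `train_naive_bayes`, `classify`), the parameters `rounds` and `top_k`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bootstrap_features(docs, seeds, rounds=2, top_k=2):
    """Grow per-class feature sets from seed keywords (assumed scoring scheme)."""
    features = {c: set(words) for c, words in seeds.items()}
    doc_words = [set(tokenize(d)) for d in docs]
    for _ in range(rounds):
        claimed = set().union(*features.values())
        scores = {c: Counter() for c in features}
        # A candidate word earns credit for a class each time it co-occurs
        # with that class's current features in an unlabeled document.
        for words in doc_words:
            for c, feats in features.items():
                overlap = len(words & feats)
                if overlap:
                    for w in words - claimed:
                        scores[c][w] += overlap
        for c in features:
            # Penalize words that co-occur with other classes as well.
            margin = {
                w: s - sum(scores[o][w] for o in scores if o != c)
                for w, s in scores[c].items()
            }
            ranked = sorted(margin, key=lambda w: (-margin[w], w))
            features[c].update(w for w in ranked[:top_k] if margin[w] > 0)
    return features


def train_naive_bayes(docs, features):
    """Pseudo-label documents by feature votes, then collect NB counts."""
    vocab = set().union(*features.values())
    counts = {c: Counter() for c in features}
    doc_totals = Counter()
    for d in docs:
        tokens = [t for t in tokenize(d) if t in vocab]
        votes = Counter()
        for t in tokens:
            for c, feats in features.items():
                if t in feats:
                    votes[c] += 1
        top = votes.most_common(2)
        # Skip documents with no features or with a tie between classes.
        if not top or (len(top) == 2 and top[0][1] == top[1][1]):
            continue
        label = top[0][0]
        counts[label].update(tokens)
        doc_totals[label] += 1
    return vocab, counts, doc_totals


def classify(text, model):
    """Return the class with the highest Laplace-smoothed log-probability."""
    vocab, counts, doc_totals = model
    n_docs = sum(doc_totals.values())
    best, best_lp = None, float("-inf")
    for c in counts:
        lp = math.log((doc_totals[c] + 1) / (n_docs + len(counts)))
        total = sum(counts[c].values())
        for t in tokenize(text):
            if t in vocab:
                lp += math.log((counts[c][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best


# Toy unlabeled corpus, invented for illustration only.
docs = [
    "football match goal",
    "football goal striker",
    "goal striker match",
    "vote election parliament",
    "vote parliament law",
    "parliament law election",
]
seeds = {"sports": ["football"], "politics": ["vote"]}

features = bootstrap_features(docs, seeds)
model = train_naive_bayes(docs, features)
print(classify("striker scores late goal", model))
print(classify("new law debated in parliament", model))
```

Neither test sentence contains a seed keyword, so any correct prediction here relies on the features picked up during bootstrapping. A real system would also need stop-word filtering and a stopping criterion for the bootstrapping rounds, which this sketch leaves out.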
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where CICLING
Authors Wenliang Chen, Jingbo Zhu, Honglin Wu, Tianshun Yao