Automatic Learning Features Using Bootstrapping for Text Categorization

When text categorization is applied to complex tasks, hand-labeling the large amounts of training data needed for good performance is tedious and expensive. In this paper, we put forward an approach to text categorization that requires no labeled documents. The proposed approach automatically learns features using bootstrapping. The input consists of a small set of keywords per class and a large amount of easily obtained unlabeled documents. Using these automatically learned features, we develop a naïve Bayes classifier. The classifier achieves 82.8% F1 when classifying a set of web documents into 10 categories, outperforming a supervised naïve Bayes classifier when only a small number of features is used.
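The abstract does not spell out the bootstrapping procedure, so the following Python sketch is only one plausible reading of it, not the authors' algorithm. It assumes that documents are pseudo-labeled by counting matches against each class's current feature set, that words frequent in one class's pseudo-labeled documents and absent from the others are promoted to features, and that a multinomial naïve Bayes model with add-one smoothing is then trained on the pseudo-labeled documents. All names (`pseudo_label`, `bootstrap_features`, `train_naive_bayes`, `classify`), the `rounds` and `per_round` parameters, and the toy data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def pseudo_label(tokens, features):
    """Return the class whose features match the most tokens, or None if unclear."""
    scores = {c: sum(t in f for t in tokens) for c, f in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = scores[ranked[0]]
    if top == 0 or (len(ranked) > 1 and scores[ranked[1]] == top):
        return None  # no match, or a tie between classes
    return ranked[0]


def bootstrap_features(seeds, docs, rounds=2, per_round=2):
    """Grow each class's feature set from seed keywords using unlabeled documents."""
    features = {c: set(words) for c, words in seeds.items()}
    tokenized = [tokenize(d) for d in docs]
    for _ in range(rounds):
        # Document frequency of each word within each pseudo-labeled class.
        counts = {c: Counter() for c in features}
        for tokens in tokenized:
            label = pseudo_label(tokens, features)
            if label is not None:
                counts[label].update(set(tokens))
        for c in features:
            known = set().union(*features.values())
            # Assumed selection rule: frequent in class c, unseen in other classes.
            candidates = [
                (n, w) for w, n in counts[c].items()
                if w not in known
                and all(counts[o][w] == 0 for o in counts if o != c)
            ]
            candidates.sort(key=lambda item: (-item[0], item[1]))
            features[c].update(w for _, w in candidates[:per_round])
    return features


def train_naive_bayes(features, docs):
    """Multinomial naive Bayes over the learned features, trained on pseudo-labels."""
    vocab = set().union(*features.values())
    word_counts = {c: Counter() for c in features}
    doc_counts = Counter()
    for d in docs:
        tokens = [t for t in tokenize(d) if t in vocab]
        label = pseudo_label(tokens, features)
        if label is not None:
            doc_counts[label] += 1
            word_counts[label].update(tokens)
    total = sum(doc_counts.values())
    model = {}
    for c in features:
        denom = sum(word_counts[c].values()) + len(vocab)  # add-one smoothing
        prior = math.log((doc_counts[c] + 1) / (total + len(features)))
        likelihood = {w: math.log((word_counts[c][w] + 1) / denom) for w in vocab}
        model[c] = (prior, likelihood)
    return model


def classify(model, text):
    tokens = tokenize(text)

    def score(c):
        prior, likelihood = model[c]
        return prior + sum(likelihood[t] for t in tokens if t in likelihood)

    return max(model, key=score)


# Toy data (hypothetical): one seed keyword per class, six unlabeled documents.
seeds = {"sports": ["football"], "finance": ["bank"]}
docs = [
    "football match goal team",
    "football team coach goal",
    "bank loan interest rate",
    "bank interest rate market",
    "team coach wins match",
    "market loan rate falls",
]
features = bootstrap_features(seeds, docs)
model = train_naive_bayes(features, docs)
print(sorted(features["sports"]))
print(classify(model, "the coach praised the team after the match"))
```

In this toy run, documents that contain no seed keyword only receive a pseudo-label in the second round, after the feature sets have grown, which is the effect bootstrapping is meant to provide.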

Wenliang Chen, Jingbo Zhu, Honglin Wu, Tianshun Yao

CICLING 2004 | Naive Bayes | Naive Bayes Classifier | Natural Language Processing | Text Categorization

Post Info

Added	01 Jul 2010
Updated	01 Jul 2010
Type	Conference
Year	2004
Where	CICLING
Authors	Wenliang Chen, Jingbo Zhu, Honglin Wu, Tianshun Yao
