Text Classification by Labeling Words

Traditionally, text classifiers are built from labeled training examples. Labeling is usually done manually by human experts (or the users), which is a labor-intensive and time-consuming process. In the past few years, researchers have investigated various forms of semi-supervised learning to reduce the burden of manual labeling. In this paper, we propose a different approach. Instead of labeling a set of documents, the proposed method labels a set of representative words for each class. It then uses these words to extract a set of documents for each class from a set of unlabeled documents to form the initial training set. The EM algorithm is then applied to build the classifier. The key issue of the approach is how to obtain a set of representative words for each class. One way is to ask the user to provide them, which is difficult because the user can usually give only a few words (which are insufficient for accurate learning). We propose a method to solve the problem. It combines cluste...
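
The pipeline outlined in the abstract (representative words per class, an initial training set extracted from unlabeled documents, then EM) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a multinomial naive Bayes model with Laplace smoothing as the classifier inside EM, takes the representative words as given rather than deriving them, and uses a simple "most seed-word hits" rule to pick the initial training documents. All function names, the toy documents, and the seed words are hypothetical.

```python
import math
from collections import Counter


def initial_labels(docs, seed_words):
    """Label a document only if one class has strictly the most seed-word hits."""
    labels = []
    for tokens in docs:
        hits = {c: sum(t in words for t in tokens) for c, words in seed_words.items()}
        best = max(hits.values())
        winners = [c for c, h in hits.items() if h == best]
        labels.append(winners[0] if best > 0 and len(winners) == 1 else None)
    return labels


def m_step(docs, posteriors, classes, vocab):
    """Re-estimate class priors and word probabilities from (soft) class weights."""
    total_weight = sum(sum(p.values()) for p in posteriors)
    priors, word_probs = {}, {}
    for c in classes:
        class_weight = sum(p[c] for p in posteriors)
        priors[c] = (class_weight + 1.0) / (total_weight + len(classes))
        counts = Counter()
        for tokens, p in zip(docs, posteriors):
            for t in tokens:
                counts[t] += p[c]
        total = sum(counts.values())
        # Laplace smoothing keeps every probability strictly positive.
        word_probs[c] = {w: (counts[w] + 1.0) / (total + len(vocab)) for w in vocab}
    return priors, word_probs


def e_step(docs, priors, word_probs, classes):
    """Compute the posterior class distribution of every document."""
    posteriors = []
    for tokens in docs:
        log_p = {
            c: math.log(priors[c]) + sum(math.log(word_probs[c][t]) for t in tokens)
            for c in classes
        }
        top = max(log_p.values())  # subtract the max for numerical stability
        unnorm = {c: math.exp(lp - top) for c, lp in log_p.items()}
        z = sum(unnorm.values())
        posteriors.append({c: v / z for c, v in unnorm.items()})
    return posteriors


def classify_by_seed_words(docs, seed_words, iterations=10):
    """Seed-word labeling followed by naive Bayes EM over all documents."""
    classes = sorted(seed_words)
    vocab = {t for tokens in docs for t in tokens}
    labels = initial_labels(docs, seed_words)
    # Documents without an initial label start with zero weight in every class.
    posteriors = [{c: 1.0 if c == lab else 0.0 for c in classes} for lab in labels]
    for _ in range(iterations):
        priors, word_probs = m_step(docs, posteriors, classes, vocab)
        posteriors = e_step(docs, priors, word_probs, classes)
    return [max(p, key=p.get) for p in posteriors]


# Toy data: the last two documents contain no seed word at all.
docs = [
    "the team won the match with a late goal".split(),
    "the striker scored a goal in the match".split(),
    "the new processor doubles the memory bandwidth".split(),
    "the laptop ships with a faster processor".split(),
    "the striker joined the team".split(),
    "the laptop has more memory".split(),
]
seed_words = {"sports": {"goal", "match"}, "tech": {"processor"}}
predictions = classify_by_seed_words(docs, seed_words)
```

In this sketch the seed-labeled documents are not clamped to their initial class during EM, so a poor seed word can be outvoted by the rest of the collection. Whether the paper makes the same choice is not stated in the abstract.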

Bing Liu, Xiaoli Li, Wee Sun Lee, Philip S. Yu

AAAI 2004 | Intelligent Agents | Manual Labeling | Representative Words | Time Consuming Process

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2004
Where	AAAI
Authors	Bing Liu, Xiaoli Li, Wee Sun Lee, Philip S. Yu
