Semi-supervised text categorization by active search

In automated text categorization, it is very challenging, if not impossible, to build a reliable, highly accurate classifier from only a small number of labeled documents. To address this problem, this paper proposes a novel web-assisted text categorization framework. Important keywords are first identified automatically from the available labeled documents and used to form queries. Search engines then retrieve a multitude of relevant documents from the Web, which a semi-supervised framework exploits. To the best of our knowledge, this work is the first study of this kind. Extensive experiments show encouraging results: using Google as the web search engine, the proposed framework reduces the classification error by 30% compared with the state-of-the-art supervised text categorization method.

Categories and Subject Descriptors: H.3.3 [Information Systems]: Info...
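The first two stages described in the abstract (keyword identification, then web retrieval) can be illustrated with a rough sketch. The abstract does not say how keywords are scored, so the TF-IDF-style weighting below is an assumption swapped in for illustration, not the authors' method. The names `build_queries`, `retrieve_unlabeled`, and the `search` callable are hypothetical; `search` stands in for whatever search engine client is available. The semi-supervised learning stage is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphabetic tokens of at least three letters.
    return re.findall(r"[a-z]{3,}", text.lower())


def build_queries(labeled_docs, top_k=3):
    """Form one keyword query per class from (label, text) pairs.

    Assumption: terms are scored by class term frequency times
    log(number of classes / number of classes containing the term),
    so terms shared by every class score zero.
    """
    class_tf = {}
    for label, text in labeled_docs:
        class_tf.setdefault(label, Counter()).update(tokenize(text))

    n_classes = len(class_tf)
    class_freq = Counter()
    for tf in class_tf.values():
        class_freq.update(tf.keys())

    queries = {}
    for label, tf in class_tf.items():
        scores = {
            term: count * math.log(n_classes / class_freq[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        queries[label] = " ".join(ranked[:top_k])
    return queries


def retrieve_unlabeled(queries, search):
    """Run each class query through a caller-supplied search function.

    `search` is a hypothetical callable: query string -> list of document
    texts. The returned documents are treated as unlabeled data.
    """
    return {label: search(query) for label, query in queries.items()}
```

With only one class every score is zero under this weighting, so the sketch is only meaningful for two or more classes. The retrieved pool would then be handed to a semi-supervised learner alongside the original labeled documents.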

Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu

CIKM 2008 | Information Management | Search Engine | Text Categorization | Text Categorization Framework |

» Large-scale text categorization by batch mode active learning

» Automatic expert identification using a text categorization technique in knowledge managem...

» Active learning via transductive experimental design

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	CIKM
Authors	Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu, Irwin King
