Towards a Universal Text Classifier: Transfer Learning Using Encyclopedic Knowledge

Document classification is a key task for many text mining applications. However, traditional text classification requires labeled data to construct reliable and accurate classifiers. Unfortunately, labeled data are seldom available. In this work, we propose a universal text classifier, which does not require any labeled documents. Our approach simulates the capability of people to classify documents based on background knowledge. As such, we build a classifier that can effectively group documents based on their content, under the guidance of a few words describing the classes of interest. Background knowledge is modeled using encyclopedic knowledge, namely Wikipedia. The universal text classifier can also be used to perform document retrieval. In our experiments with real data, we test the feasibility of our approach for both the classification and retrieval tasks.

Keywords: Transfer learning; Text classifiers; Wikipedia
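
To make the idea of classifying without labeled documents more concrete, here is a minimal sketch. It is not the authors' method: it only illustrates the general pattern of expanding a few class-describing words with background text and assigning each document to the most similar class. The `BACKGROUND` dictionary is a hypothetical toy stand-in for Wikipedia articles, and the function names, the cosine-similarity scoring, and the example classes are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for encyclopedic background knowledge: each key plays
# the role of an article title, each value the role of that article's text.
BACKGROUND = {
    "football": "football team goal match league player stadium",
    "computer": "computer software hardware program algorithm processor",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def class_profile(keywords):
    """Expand a few class keywords with background text into term counts."""
    terms = Counter()
    for word in tokenize(keywords):
        terms[word] += 1
        terms.update(tokenize(BACKGROUND.get(word, "")))
    return terms


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(document, class_keywords):
    """Return the best-matching class label and the score for every class."""
    doc_vec = Counter(tokenize(document))
    scores = {
        label: cosine(doc_vec, class_profile(keywords))
        for label, keywords in class_keywords.items()
    }
    return max(scores, key=scores.get), scores


# Classes are described by a few words only; no labeled documents are used.
classes = {"sports": "football", "technology": "computer"}
label, scores = classify(
    "The player scored a late goal in the league match.", classes
)
print(label, scores)
```

The same per-class scores could be used for retrieval by ranking a document collection against a single class description, although how the paper itself performs retrieval is not specified in the abstract.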

Pu Wang, Carlotta Domeniconi

Data Mining | Document | ICDM 2009 | Text Classifiers | Universal Text Classifier |

» Towards semantic knowledge propagation from text corpus to web images

» Cross Language Text Classification by Model Translation and SemiSupervised Learning

» Intelligent Search in a Collection of Video Lectures

Post Info

Added	18 Feb 2011
Updated	18 Feb 2011
Type	Conference
Year	2009
Where	ICDM
Authors	Pu Wang, Carlotta Domeniconi
