ADCS
2004

Co-Training on Textual Documents with a Single Natural Feature Set

Co-training is a semi-supervised technique that allows classifiers to learn with fewer labelled documents by taking advantage of the more abundant unclassified documents. However, conventional co-training requires the dataset to be described by two disjoint and natural feature sets that are redundantly sufficient. In many practical situations, datasets have a single set of features and it is not obvious how to split it into two. This paper investigates the performance of co-training with only one natural feature set in two applications: Web page classification and email filtering.
Keywords: Text categorization, Web page classification, spam filtering, co-training
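As a rough illustration of the idea in the abstract, the following minimal sketch runs co-training on a single bag-of-words feature set by splitting the vocabulary into two disjoint views. The abstract does not say how the paper splits the features or which classifier it uses, so the random split, the small naive Bayes learner, and every name here (`NaiveBayes`, `co_train`, `predict`, `rounds`, `per_round`, `seed`) are assumptions made for illustration, not the authors' method.

```python
import math
import random
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes restricted to one view (a subset of the vocabulary)."""

    def __init__(self, view):
        self.view = set(view)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(w for w in doc if w in self.view)
        return self

    def predict_proba(self, doc):
        words = [w for w in doc if w in self.view]
        n_docs = sum(self.prior.values())
        log_scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            score = math.log(self.prior[c] / n_docs)
            for w in words:
                # Laplace smoothing over this view's vocabulary.
                score += math.log((self.counts[c][w] + 1) / (total + len(self.view)))
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def co_train(labelled, unlabelled, rounds=5, per_round=1, seed=0):
    """labelled: list of (tokens, label); unlabelled: list of token lists.

    Assumption: the single feature set is split at random into two halves.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc, _ in labelled for w in doc}
                   | {w for doc in unlabelled for w in doc})
    rng.shuffle(vocab)
    half = len(vocab) // 2
    views = [vocab[:half], vocab[half:]]

    labelled = list(labelled)   # copy so the caller's list is untouched
    pool = list(unlabelled)

    def train():
        docs = [d for d, _ in labelled]
        labels = [y for _, y in labelled]
        return [NaiveBayes(v).fit(docs, labels) for v in views]

    for _ in range(rounds):
        for model in train():
            if not pool:
                break
            # Each view's classifier labels the pool documents it is most confident about.
            scored = []
            for i, doc in enumerate(pool):
                proba = model.predict_proba(doc)
                guess = max(proba, key=proba.get)
                scored.append((proba[guess], i, guess))
            scored.sort(reverse=True)
            chosen = scored[:per_round]
            for _, i, guess in chosen:
                labelled.append((pool[i], guess))
            for i in sorted((i for _, i, _ in chosen), reverse=True):
                pool.pop(i)
    return train(), labelled


def predict(models, doc):
    """Combine the two view classifiers by multiplying their class probabilities."""
    probas = [m.predict_proba(doc) for m in models]
    return max(models[0].classes, key=lambda c: math.prod(p[c] for p in probas))


# Usage with made-up toy data (hypothetical, not from the paper).
seed_docs = [(["cheap", "pills", "offer"], "spam"), (["meeting", "agenda", "notes"], "ham")]
pool_docs = [["cheap", "offer", "now"], ["agenda", "meeting", "today"],
             ["pills", "cheap"], ["notes", "agenda"]]
view_models, grown = co_train(seed_docs, pool_docs, rounds=2)
print(predict(view_models, ["cheap", "pills"]))
```

With a random split, neither half is guaranteed to be sufficient on its own, which is exactly the condition that conventional co-training assumes and that the paper sets out to examine.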
Jason Chan, Irena Koprinska, Josiah Poon
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where ADCS
Authors Jason Chan, Irena Koprinska, Josiah Poon