Sciweavers

WEBI
2004
Springer

Co-training with a Single Natural Feature Set Applied to Email Classification

When dealing with information overload from the Internet, in tasks such as classifying Web pages and filtering email spam, a new technique called co-training has been shown to be a promising approach to building more accurate classifiers. Co-training allows classifiers to learn from fewer labelled documents by taking advantage of the more abundant unlabelled documents. However, conventional co-training requires the dataset to be described by two disjoint, natural feature sets that are sufficiently redundant. In many practical situations, it is not intuitively obvious how to obtain two natural feature sets. This paper shows that co-training is still beneficial for email classification when only a single natural feature set is used.
Jason Chan, Irena Koprinska, Josiah Poon
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where WEBI
Authors Jason Chan, Irena Koprinska, Josiah Poon