ADCS
2004

Co-Training on Textual Documents with a Single Natural Feature Set

Co-training is a semi-supervised technique that allows classifiers to learn with fewer labelled documents by taking advantage of the more abundant unclassified documents. However, conventional co-training requires the dataset to be described by two disjoint and natural feature sets that are redundantly sufficient. In many practical situations, datasets have a single set of features, and it is not obvious how to split it into two. This paper investigates the performance of co-training with only one natural feature set in two applications: Web page classification and email filtering.

Keywords: Text categorization, Web page classification, spam filtering, co-training
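
To make the idea concrete, here is a minimal sketch of co-training when only one feature set (a bag of words) is available. It is not the authors' implementation: the random split of the vocabulary into two views, the naive Bayes classifier, the toy data, and all function names (`train_nb`, `predict_nb`, `co_train`, `classify`) are assumptions made purely for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes word counts, restricted to the words in `vocab`."""
    counts = defaultdict(Counter)
    for words, y in zip(docs, labels):
        counts[y].update(w for w in words if w in vocab)
    return counts, Counter(labels), vocab


def predict_nb(model, words):
    """Return (label, margin); margin is the log-score gap to the runner-up."""
    counts, class_docs, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for y in class_docs:
        denom = sum(counts[y].values()) + len(vocab)  # add-one smoothing
        score = math.log(class_docs[y] / total_docs)
        for w in words:
            if w in vocab:
                score += math.log((counts[y][w] + 1) / denom)
        scores[y] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    margin = scores[ranked[0]] - scores[ranked[1]] if len(ranked) > 1 else 0.0
    return ranked[0], margin


def co_train(labelled, unlabelled, rounds=5, per_round=1, seed=0):
    """Co-train two views obtained by randomly halving a single vocabulary."""
    rng = random.Random(seed)
    vocab = sorted({w for words, _ in labelled for w in words}
                   | {w for words in unlabelled for w in words})
    rng.shuffle(vocab)  # assumption: a random split stands in for natural views
    half = len(vocab) // 2
    views = [set(vocab[:half]), set(vocab[half:])]

    docs = [words for words, _ in labelled]
    labels = [y for _, y in labelled]
    pool = list(unlabelled)

    for _ in range(rounds):
        if not pool:
            break
        for view in views:
            model = train_nb(docs, labels, view)
            # This view labels the pool documents it is most confident about,
            # and those documents join the shared training set.
            scored = sorted(((predict_nb(model, d), i) for i, d in enumerate(pool)),
                            key=lambda t: t[0][1], reverse=True)
            chosen = scored[:per_round]
            for (label, _), i in chosen:
                docs.append(pool[i])
                labels.append(label)
            taken = {i for _, i in chosen}
            pool = [d for i, d in enumerate(pool) if i not in taken]
            if not pool:
                break

    models = [train_nb(docs, labels, view) for view in views]
    return models, docs, labels


def classify(models, words):
    """Use the prediction of whichever view is more confident."""
    return max((predict_nb(m, words) for m in models), key=lambda p: p[1])[0]


# Toy data, invented for this example only.
labelled = [
    ("cheap pills buy now".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
]
unlabelled = [
    "buy cheap watches now".split(),
    "monday project meeting notes".split(),
    "cheap offer buy pills".split(),
    "agenda for project review".split(),
]

models, docs, labels = co_train(labelled, unlabelled)
print(classify(models, "cheap pills offer".split()))
```

With a random split there is no guarantee that either half of the vocabulary is sufficient on its own, which is the situation the abstract describes; the sketch only shows the mechanics of the training loop.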

Jason Chan, Irena Koprinska, Josiah Poon

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2004
Where	ADCS
Authors	Jason Chan, Irena Koprinska, Josiah Poon