The problem of sequence categorization is to generalize from a corpus of labeled sequences procedures for accurately labeling future unlabeled sequences. The choice of representat...
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple in...
The ridge logistic regression has successfully been used in text categorization problems and it has been shown to reach the same performance as the Support Vector Machine but with...
PKIP, Patterned Keywords in Phrase, is our feature selection approach to text categorization (TC) for item banks. An item bank is a collection of textual data in which each item c...
Atorn Nuntiyagul, Nick Cercone, Kanlaya Naruedomku...
In this paper we study the effectiveness of using a phrase-based representation in e-mail classification, and the affect this approach has on a number of machine learning algorithm...