In this paper we study the effectiveness of using a phrase-based representation in e-mail classification, and the affect this approach has on a number of machine learning algorithm...
The problem of sequence categorization is to generalize from a corpus of labeled sequences procedures for accurately labeling future unlabeled sequences. The choice of representat...
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple in...
Filtering feature selection method (filtering method, for short) is a well-known feature selection strategy in pattern recognition and data mining. Filtering method outperforms ot...
Ou Wu, Haiqiang Zuo, Mingliang Zhu, Weiming Hu, Ju...