Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge numbers of features. Most previous studies found that the major...
Consider a supervised learning problem in which examples contain both numerical- and text-valued features. To use traditional featurevector-based learning methods, one could treat...
Sofus A. Macskassy, Haym Hirsh, Arunava Banerjee, ...
In this paper, we introduce a method that automatically builds text classifiers in a new language by training on already labeled data in another language. Our method transfers the...
We conduct large-scale experiments to investigate optimal features for classification of verbs in biomedical texts. We introduce a range of feature sets and associated extraction ...
Text classification is one of the most actual among the natural language processing problems. In this paper the application of word-based PPM (Prediction by Partial Matching) mode...