As the amount of online text grows, so does the demand for text classification to aid its analysis and management. Text is cheap, but information, in the form of...
Chowdhury Mofizur Rahman, Ferdous Ahmed Sohel, Par...
In recent years, active learning methods based on experimental design have achieved state-of-the-art performance in text classification applications. Although these methods can exploit ...
The number of features to be considered in a text classification system is given by the size of the vocabulary, and this is normally in the range of the tens or hundreds o...
David Vilar, Hermann Ney, Alfons Juan, Enrique Vid...
This paper investigates the use of stemming for classification of Dutch (email) texts. We introduce a stemmer, which combines dictionary lookup (implemented efficiently as a finit...
The main problems in text classification are lack of labeled data, as well as the cost of labeling the unlabeled data. We address these problems by exploring co-training - an algo...
The world wide web has a wealth of information that is related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
We illustrate that Web searches can often be utilized to generate background text for use with text classification. This is the case because there are frequently many pages on the...
Text classification or categorization is a conventional classification problem applied to the text domain. In the cases when statistical classification methods are used,...
Standard machine learning approaches to text classification use the bag-of-words representation of documents to derive the classification target function. Typical linguistic stru...
This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...