This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods we...
When linear support vector machines (SVMs) are applied to multi-class text categorization in industry, the size of the linear SVM model is very large, usually greater than several...
In this paper, we propose a new classification method that addresses classification in multiple categories of textual documents. We call it Matrix Regression (MR) due to its resem...
Iulian Sandu Popa, Karine Zeitouni, Georges Gardar...
In many real-world domains, supervised learning requires a large number of training examples. In this paper, we describe an active learning method that uses a committee of learner...
We show that excluding outliers from the training data significantly improves kNN classifier, which in this case performs about 10% better than the best know method--Centroid-based...