Feature selection has improved the performance of text clustering. Global feature selection tries to identify a single subset of features which are relevant to all cluste...
Marcelo N. Ribeiro, Manoel J. R. Neto, Ricardo Bas...
We present four new feature selection methods for ordinal regression and test them against four different baselines on two large datasets of product reviews.
Stefano Baccianella, Andrea Esuli, Fabrizio Sebast...
This paper presents a cluster-based text categorization system which uses class distributional clustering of words. We propose a new clustering model which considers the global in...
The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorith...
Author identification is a text categorization task with applications in intelligence, criminal law, computer forensics, etc. Usually, in such cases there is a shortage of training t...