In this paper we propose and test the use of hierarchical clustering for feature selection. The clustering method is Ward's with a distance measure based on GoodmanKruskal ta...
Abstract Dino Ienco and Rosa Meo Dipartimento di Informatica, Universit`a di Torino, Italy In this paper we propose and test the use of hierarchical clustering for feature selectio...
Abstract. Database systems are islands of structure in a sea of unstructured data sources. Several real-world applications now need to create bridges for smooth integration of semi...
Abstract. The number of features to be considered in a text classification system is given by the size of the vocabulary and this is normally in the range of the tens or hundreds o...
David Vilar, Hermann Ney, Alfons Juan, Enrique Vid...
In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper sh...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...