This paper presents a novel prototype hierarchy based clustering (PHC) framework for the organization of web collections. It simultaneously solves the problem of categorizing web ...
Abstract. A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...
Abstract. Several national statistical agencies are now releasing partially synthetic, public use microdata. These comprise the units in the original database with sensitive or ide...
Abstract. Feature subset selection is an important subject when training classifiers in Machine Learning (ML) problems. Too many input features in an ML problem may lead to the so-...
Abstract. The bag-of-words (BOW) model is inspired by the text classification problem, where a document is represented by an unordered set of the words it contains. Analogously, in the objec...
Mehdi Mirza-Mohammadi, Sergio Escalera, Petia Rade...