Different from familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic re...
In this paper, we propose a novel technique for the efficient prediction of multiple continuous target variables from high-dimensional and heterogeneous data sets using a hierarch...
Aleksandar Lazarevic, Ramdev Kanapady, Chandrika K...
In the context of large databases, data preparation takes a greater importance : instances and explanatory attributes have to be carefully selected. In supervised learning, instanc...
The Correlation Clustering problem, also known as the Cluster Editing problem, seeks to edit a given graph by adding and deleting edges to obtain a collection of disconnected cliq...
Hans L. Bodlaender, Michael R. Fellows, Pinar Hegg...
Text classification using a small labeled set and a large unlabeled data is seen as a promising technique to reduce the labor-intensive and time consuming effort of labeling traini...