Abstract— In this paper we suggest a new approach to represent text document collections, integrating background knowledge to improve clustering effectiveness. Background knowled...
Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption on the data distrib...
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Under construction… Categories and Subject Descriptors H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – clustering. General Terms Algorithms, Expe...
Semi-supervised learning methods construct classifiers using both labeled and unlabeled training data samples. While unlabeled data samples can help to improve the accuracy of trai...