We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organize docum...
The manipulation of large-scale document data sets often involves the processing of a wealth of features that correspond to the available terms in the document space. The employm...
We present a domain-independent unsupervised topic segmentation approach based on hybrid document indexing. Lexical chains have been successfully employed to evaluate lexical cohe...
In this paper we present an innovative two-stage adaptation approach for handwriting recognition that is based on clustering of similar pages in the training data. In our approach...