Document clustering has many uses in natural language tools and applications. For instance, summarizing sets of documents that all describe the same event requires first identifyi...
This paper presents a novel algorithm for document clustering based on a combinatorial framework of the Principal Direction Divisive Partitioning (PDDP) algorithm [1] and a simpli...
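For readers unfamiliar with PDDP, the core splitting step can be sketched as follows. This is a minimal illustration of a single PDDP split only, not the combined algorithm the abstract above proposes. It assumes documents are already represented as dense term-frequency vectors. The function names `principal_direction` and `pddp_split` are hypothetical, and power iteration stands in for the SVD that PDDP implementations normally use to find the principal direction.

```python
import math


def principal_direction(rows, iters=200):
    """Approximate the leading principal direction of `rows` by power iteration.

    Assumption: power iteration replaces the SVD normally used in PDDP.
    Returns (mean vector, unit direction vector).
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]

    # Start from the centered row with the largest norm (a heuristic choice).
    start = max(centered, key=lambda c: sum(x * x for x in c))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        # All documents are identical, so there is no direction to split on.
        return mean, [0.0] * d
    v = [x / norm for x in start]

    for _ in range(iters):
        # Multiply v by the scatter matrix without forming it explicitly.
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def pddp_split(rows):
    """Split document indices by the sign of their projection onto the principal direction."""
    mean, v = principal_direction(rows)
    left, right = [], []
    for i, r in enumerate(rows):
        score = sum((r[j] - mean[j]) * v[j] for j in range(len(v)))
        (left if score <= 0 else right).append(i)
    return left, right


# Toy term-frequency vectors: documents 0-1 use the first term, 2-3 the third.
docs = [[2, 0, 0], [3, 1, 0], [0, 0, 2], [0, 1, 3]]
left, right = pddp_split(docs)
```

A full PDDP run would apply this split repeatedly, each time choosing which cluster to divide next (typically the one with the largest scatter), until the desired number of clusters is reached.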
This paper presents a new, enhanced algorithm for extracting text from degraded document images, based on probabilistic models. The observed document image is considered as a...
Text classification using a small labeled set and a large unlabeled data set is seen as a promising technique to reduce the labor-intensive and time-consuming effort of labeling traini...
Interesting web-available abstracts and papers on clustering: An Analysis of Recent Work on Clustering Algorithms (1999), Daniel Fasulo: This paper describes four recent papers on cl...