High-performance document clustering systems enable similar documents to automatically self-organize into groups. In the past, the large amount of computational time needed to clu...
G. Adam Covington, Charles L. G. Comstock, Andrew ...
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
A novel method for simultaneous keyphrase extraction and generic text summarization is proposed by modeling text documents as weighted undirected and weighted bipartite graphs. Sp...
Search results clustering problem is defined as an automatic, on-line grouping of similar documents in a search hits list, returned from a search engine. In this paper we present t...
This work presents a methodology for grouping structurally similar XML documents using clustering algorithms. Modeling XML documents with tree-like structures, we face the ‘clust...
Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, T...