We study methods to initialize or bias different clustering methods using prior information about the "importance" of a keyword w.r.t. the whole document collection or a...
Common document clustering algorithms utilize models that either divide a corpus into smaller clusters or gather individual documents into clusters. Hierarchical Agglomerative Clus...
Assistance in retrieving of documents on the World Wide Web is provided either by search engines, through keyword based queries, or by catalogues, which organise documents into hi...
We propose a novel and efficient solution to the problem of clustering XML documents based on their structure. We use operations on multisets of paths of document trees to define...
Document clustering is useful in many information retrieval tasks: document browsing, organization and viewing of retrieval results, generation of Yahoo-like hierarchies of docume...