This paper introduces a new technique for document clustering based on frequent senses. The proposed system, GDClust (Graph-Based Document Clustering), works with frequent senses ra...
We present a divide-and-merge methodology for clustering a set of objects that combines a top-down "divide" phase with a bottom-up "merge" phase. In contrast, ...
David Cheng, Santosh Vempala, Ravi Kannan, Grant W...
We consider the problem of retrieving multiple documents relevant to the individual subtopics of a given web query, termed “full-subtopic retrieval”. To solve this problem, we pres...
Andrea Bernardini, Claudio Carpineto, Massimiliano...
Word Sense Disambiguation in text is still a difficult problem, as the best supervised methods require laborious and costly manual preparation of training data. Thus, this work focu...
A major problem encountered by text clustering practitioners is the difficulty of determining a priori which is the optimal text representation and clustering technique f...