This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organizes docum...
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase he...
Vasileios Hatzivassiloglou, Luis Gravano, Ankineed...
Document clustering is a very hard task in Automatic Text Processing since it requires to extract regular patterns from a document collection without a priori knowledge on the cat...
Abstract. With information proliferation on the Web, how to obtain highquality information from the Web has been one of hot research topics in many fields like Database, IR as well...