We present an approach to document clustering based on winnowing fingerprints that achieved good values of effectiveness with considerable save in memory space and computation tim...
This paper presents a novel formulation and approach to the minimal document set retrieval problem. Minimal Document Set Retrieval (MDSR) is a promising information retrieval task...
Documents in HTML format have many features to analyze, from the terms in special sections to the phrases that appear in the whole document. However, it is important to decide whi...
Among various document clustering algorithms that have been proposed so far, the most useful are those that automatically reveal the number of clusters and assign each target docum...
Eugene Levner, David Pinto, Paolo Rosso, David Alc...
Document clustering is a very hard task in Automatic Text Processing since it requires to extract regular patterns from a document collection without a priori knowledge on the cat...