Search Sciweavers | Sciweavers


» A Multi-level Approach for Document Clustering


ICDM
2007
IEEE

Data Mining

Bit Sequences and Biclustering of Text Documents


We propose a new technique for clustering of text documents that relies on a biclustering structure constructed on terms and documents. Our approach makes use of a greedy algorith...

Selim Mimaroglu, Kuniaki Uehara


SIGIR
2006
ACM

Information Technology

Near-duplicate detection by instance-level constrained clustering


For the task of near-duplicate document detection, both traditional fingerprinting techniques used in the database community and bag-of-words comparison approaches used in information...

Hui Yang, James P. Callan


WIDM
2003
ACM

Internet Technology

Clustering documents in a web directory


Hierarchical categorization of documents is a task receiving growing interest due to the widespread proliferation of topic hierarchies for text documents. The worst problem of hie...

Giordano Adami, Paolo Avesani, Diego Sona


ICDAR
2011
IEEE

Document Analysis

Word Retrieval in Historical Document Using Character-Primitives


Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...

Partha Pratim Roy, Jean-Yves Ramel, Nicolas Ragot


ICCS
2009
Springer

Applied Computing

Frequent Itemset Mining for Clustering Near Duplicate Web Documents


A vast number of documents on the Web have duplicates, which is a challenge for developing efficient methods that would compute clusters of similar documents. In this paper we use ...

Dmitry I. Ignatov, Sergei O. Kuznetsov
