Search Sciweavers | Sciweavers

77 search results - page 5 / 16

» Pairwise Document Similarity in Large Collections with MapRe...

ICCS
2009
Springer

107 views, Applied Computing

Frequent Itemset Mining for Clustering Near Duplicate Web Documents

14 years 2 months ago

A vast amount of documents in the Web have duplicates, which is a challenge for developing efficient methods that would compute clusters of similar documents. In this paper we use ...

Dmitry I. Ignatov, Sergei O. Kuznetsov

ICDE
2004
IEEE

151 views, Database

Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks

14 years 8 months ago

We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of impo...

Torsten Suel, Patrick Noel, Dimitre Trendafilov

SIGIR
2008
ACM

176 views, Information Technology

SpotSigs: robust and efficient near duplicate detection in large web collections

13 years 7 months ago

Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...

Martin Theobald, Jonathan Siddharth, Andreas Paepc...

DIAL
2006
IEEE

167 views, Image Analysis

Tree clustering for layout-based document image retrieval

14 years 1 months ago

We describe a system for the retrieval on the basis of layout similarity of document images belonging to collections stored in digital libraries. Layout regions are extracted and ...

Simone Marinai, Emanuele Marino, Giovanni Soda

BMCBI
2008

80 views

Towards an automatic classification of protein structural domains based on structural similarity

13 years 7 months ago

Background: Formal classification of a large collection of protein structures aids the understanding of evolutionary relationships among them. Classifications involving manual ste...

Vichetra Sam, Chin-Hsien Tai, Jean Garnier, Jean-F...
