Search Sciweavers | Sciweavers

48 search results - page 3 / 10

» Collection statistics for fast duplicate document detection

JCB
2007

Clustered Sequence Representation for Fast Homology Search

Download web.udl.es

We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...

Michael Cameron, Yaniv Bernstein, Hugh E. Williams

ICMCS
2007
IEEE

SICO: A System for Detection of Near-Duplicate Images During Search

Download goanna.cs.rmit.edu.au

Duplicate and near-duplicate digital image matching is beneficial for image search in terms of collection management, digital content protection, and search efficiency. In this ...

Jun Jie Foo, Ranjan Sinha, Justin Zobel

SIGIR
2010
ACM

Efficient partial-duplicate detection based on sequence matching

Download homepage.fudan.edu.cn

With the ever-increasing growth of the Internet, numerous copies of documents become serious problem for search engine, opinion mining and many other web applications. Since parti...

Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang

ICAIL
2007
ACM

Essential deduplication functions for transactional databases in law firms

Download www.conradweb.org

As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...

Jack G. Conrad, Edward L. Raymond

INEX
2007
Springer

Phrase Detection in the Wikipedia

Download www.cs.helsinki.fi

The Wikipedia XML collection turned out to be rich of marked-up phrases as we carried out our INEX 2007 experiments. Assuming that a phrase occurs at the inline level of the markup...

Miro Lehtonen, Antoine Doucet
