near-duplicate detection

SIGIR
2010
ACM

Efficient partial-duplicate detection based on sequence matching

15 years 1 months ago

With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...

Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang

DGO
2006

Next steps in near-duplicate detection for eRulemaking

15 years 8 months ago

Download www.cs.cmu.edu

Large-volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...

Hui Yang, Jamie Callan, Stuart W. Shulman

MM
2009
ACM

MyFinder: near-duplicate detection for large image collections

15 years 12 months ago

Download www.uweb.ucsb.edu

The explosive growth of multimedia data poses serious challenges to data storage, management and search. Efficient near-duplicate detection is one of the required technologies for...

Xin Yang, Qiang Zhu, Kwang-Ting Cheng

SIGIR
2006
ACM

Near-duplicate detection by instance-level constrained clustering

16 years 1 months ago

Download www.cs.cmu.edu

For the task of near-duplicated document detection, both traditional fingerprinting techniques used in the database community and bag-of-words comparison approaches used in information...

Hui Yang, James P. Callan
