As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Nowadays, the issue of near-duplicate video matching has been extensively studied. However, transformation, which is one of the major causes of near-duplicates, has been little di...
Abstract-- Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Haichuan Sh...
Near-duplicate keyframes (NDK) play a unique role in large-scale video search, news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach by explo...
We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...
Michael Cameron, Yaniv Bernstein, Hugh E. Williams