Data de-duplication has become a commodity component in dataintensive systems and it is required that these systems provide high reliability comparable to others. Unfortunately, b...
Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongshen...
Duplicate detection determines different representations of realworld objects in a database. Recent research has considered the use of relationships among object representations t...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
Code duplication is an important problem in application maintenance. Tools exist that support code duplication detection. However, few of them propose a solution for the problem, ...
Distributed peer-to-peer systems rely on voluntary participation of peers to effectively manage a storage pool. In such systems, data is generally replicated for performance and a...
Ronaldo A. Ferreira, Murali Krishna Ramanathan, An...