Optimizing Near Duplicate Detection for P2P Networks


In this paper, we propose a probabilistic algorithm for detecting near-duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To this end, we present a thorough cost and probabilistic analysis that allows the algorithm to adapt to network and data collection characteristics in order to minimize network cost. In addition, we extend the algorithm so that it can identify similar videos, even if some of the videos are split into different files. A thorough theoretical analysis and a large-scale experimental evaluation on networks of up to 100,000 peers, using real-world datasets of more than 200 Gbytes, demonstrate the viability of our approach.

Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersdorfer, Wolfgang Nejdl


Algorithm | Communications | Large-scale P2p Systems | P2P 2010 | Thorough Theoretical Analysis |


» Robust incentive techniques for peer-to-peer networks

» Genre-Adaptive Near-Duplicate Video Segment Detection

» RxIP: Monitoring the health of home wireless networks

» Near-Optimal Compression of Probabilistic Counting Sketches for Networking Applications

» Fast Duplicate Address Detection for Mobile IPv6

» Using a cache scheme to detect selfish nodes in mobile ad hoc networks

» Sparse data aggregation in sensor networks

» Cost-effective outbreak detection in networks


Added	29 Jan 2011
Updated	29 Jan 2011
Type	Journal
Year	2010
Where	P2P
Authors	Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersdorfer, Wolfgang Nejdl

