Sciweavers

PCIR: Combining DHTs and peer clusters for efficient full-text P2P indexing

Distributed hash tables (DHTs) are very efficient for querying based on key lookups. However, building huge term indexes, as required for IR-style keyword search, poses a scalability challenge for plain DHTs. Due to the large sizes of document term vocabularies, peers joining the network cause huge amounts of key inserts and, consequently, a large number of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance costs. Various approaches in this direction have been pursued, including the use of hybrid infrastructures, or changing the granularity of the inverted index to peer level. We show that indexing costs can be significantly reduced further by letting peers form groups in a self-organized fashion. Instead of each individual peer submitting index information separately, all peers of a group cooperate to publish the index updates to the DHT in batches. Our evaluation shows that this approach reduces index mai...
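The batching idea in the abstract can be illustrated with a toy cost model. The sketch below is not the PCIR protocol itself. It assumes a peer-level index in which one DHT insert message is needed per term, and that a group can merge its members' vocabularies and send a single message per distinct term. The function names, the sample vocabularies, and the grouping are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def individual_publish(peer_terms):
    """Count messages when every peer inserts each of its terms separately.

    Assumption: one DHT message per (peer, term) pair.
    """
    return sum(len(terms) for terms in peer_terms.values())


def batched_publish(peer_terms, groups):
    """Count messages when each group merges its members' terms first.

    Assumption: one DHT message per distinct term per group, carrying
    the list of group members that hold the term.
    """
    messages = 0
    for members in groups:
        merged = defaultdict(set)  # term -> peers in this group holding it
        for peer in members:
            for term in peer_terms[peer]:
                merged[term].add(peer)
        messages += len(merged)
    return messages


# Hypothetical vocabularies for four peers, split into two groups.
peer_terms = {
    "p1": {"dht", "index", "peer"},
    "p2": {"dht", "index", "query"},
    "p3": {"dht", "cluster"},
    "p4": {"index", "cluster", "peer"},
}
groups = [["p1", "p2"], ["p3", "p4"]]

print(individual_publish(peer_terms))       # messages without grouping
print(batched_publish(peer_terms, groups))  # messages with grouping
```

In this model the saving comes entirely from vocabulary overlap inside a group: the more terms the members share, the fewer distinct keys the group has to publish.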
Odysseas Papapetrou, Wolf Siberski, Wolfgang Nejdl
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2010
Where CN