Query-driven indexing for scalable peer-to-peer text retrieval


We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with the bandwidth consumption that has been identified as the major problem of the standard P2P approach based on single-term indexing, we leverage a distributed index that stores at most the top-k document references, and only for carefully chosen combinations of indexing terms. In addition, since the number of possible term combinations extracted from a document collection can be very large, we propose to use query statistics to index only those combinations that users frequently request. Thus, by avoiding the maintenance of superfluous indexing information, we achieve a substantial reduction in bandwidth and storage. A specific activation mechanism continuously updates the indexing information according to changes in the query distribution, resulting in an efficient, constantly evolving query-driven indexing structure. We show that the...
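The idea in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the class `QueryDrivenIndex`, its parameters (`top_k`, `activation_threshold`, `max_key_size`), the term-frequency scoring, and the halving `decay` step are all assumptions made for illustration, and a plain dictionary stands in for the distributed index of a structured P2P network. The sketch only shows the general pattern of counting how often a term combination is queried, activating it as an index key once it is popular enough, storing at most the top-k document references for it, and dropping it again when its popularity fades.

```python
from collections import Counter
from itertools import combinations
import heapq


class QueryDrivenIndex:
    """Toy stand-in for a query-driven term-combination index (hypothetical)."""

    def __init__(self, top_k=2, activation_threshold=2, max_key_size=2):
        self.top_k = top_k
        self.activation_threshold = activation_threshold
        self.max_key_size = max_key_size
        self.documents = {}            # doc_id -> Counter of term frequencies
        self.query_counts = Counter()  # term combination -> times requested
        self.index = {}                # activated key -> top-k [(score, doc_id)]

    def add_document(self, doc_id, text):
        self.documents[doc_id] = Counter(text.lower().split())
        # Refresh the postings of keys that are already activated.
        for key in self.index:
            self.index[key] = self._top_k(key)

    def _top_k(self, key):
        # Assumed scoring: summed term frequency over documents with all terms.
        scored = [
            (sum(tf[t] for t in key), doc_id)
            for doc_id, tf in self.documents.items()
            if all(t in tf for t in key)
        ]
        return heapq.nlargest(self.top_k, scored)

    def _record(self, terms):
        # Count every multi-term combination in the query; activate popular ones.
        for size in range(2, min(len(terms), self.max_key_size) + 1):
            for key in combinations(terms, size):
                self.query_counts[key] += 1
                popular = self.query_counts[key] >= self.activation_threshold
                if popular and key not in self.index:
                    self.index[key] = self._top_k(key)

    def search(self, query):
        terms = sorted(set(query.lower().split()))
        self._record(terms)
        # Prefer the largest activated combination contained in the query.
        for size in range(min(len(terms), self.max_key_size), 1, -1):
            for key in combinations(terms, size):
                if key in self.index:
                    return [doc_id for _, doc_id in self.index[key]]
        return []  # a fuller system would fall back to single-term keys here

    def decay(self):
        # Assumed aging policy: halve all counts, deactivate keys that fall
        # below the threshold so the index follows the query distribution.
        for key in list(self.query_counts):
            self.query_counts[key] //= 2
            if self.query_counts[key] < self.activation_threshold:
                self.index.pop(key, None)


if __name__ == "__main__":
    idx = QueryDrivenIndex()
    idx.add_document("d1", "peer to peer text retrieval")
    idx.add_document("d2", "text retrieval text retrieval in libraries")
    print(idx.search("text retrieval"))  # first request: key not yet activated
    print(idx.search("text retrieval"))  # second request activates the key
```

In this sketch nothing is stored for a combination until it has been requested `activation_threshold` times, which is where the saving in storage and maintenance traffic would come from; the threshold and the aging policy are the knobs that trade index size against hit rate.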

Gleb Skobeltsyn, Toan Luu, Ivana Podnar Zarko, Martin Rajman, Karl Aberer


Document Collections | Indexing Information | Indexing Term Combinations | Information Management | INFOSCALE 2007 |


» A Scalable Indexing Mechanism for Ontology-Based Information Integration

» Efficient indexing for large scale visual search

» AlvisP2P: scalable peer-to-peer text retrieval in a structured P2P network

» Leveraging a scalable row store to build a distributed text index

» Scalable Text Retrieval for Large Digital Libraries

» Straightforward Feature Selection for Scalable Latent Semantic Indexing

» SPRITE: A Learning-Based Text Retrieval System in DHT Networks

» Toward Massive Scalability in Image Matching

Post Info

Added	26 Oct 2010
Updated	26 Oct 2010
Type	Conference
Year	2007
Where	INFOSCALE
Authors	Gleb Skobeltsyn, Toan Luu, Ivana Podnar Zarko, Martin Rajman, Karl Aberer
