Building a distributed full-text index for the Web

Download ilpubs.stanford.edu

We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We suggest and compare different strategies for collecting global statistics from distributed inverted indexes. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented.

Sergey Melnik, Sriram Raghavan, Beverly Yang, Hector Garcia-Molina

Crucial Design Issues | Embedded Database System | Index Construction Time | Internet Technology | WWW 2001

» REFEREE: An Open Framework for Practical Testing of Recommender Systems using ResearchIndex

» The Anatomy of a Large-Scale Hypertextual Web Search Engine

» Index structures and algorithms for querying distributed RDF repositories

» Distributed Digital Library Architecture Incorporating Different Index Styles

» Building a Data Grid for the Australian Nanostructural Analysis Network

» ODISSEA: A Peer-to-Peer Architecture for Scalable Web Search and Information Retrieval

» Evaluation of Join Strategies for Distributed Mediation

» Finding Data, Knowledge, and Answers on the Semantic Web

Post Info

Added	22 Nov 2009
Updated	22 Nov 2009
Type	Conference
Year	2001
Where	WWW
Authors	Sergey Melnik, Sriram Raghavan, Beverly Yang, Hector Garcia-Molina
