Efficient Distributed Algorithms to Build Inverted Files


We present three distributed algorithms to build global inverted files for very large text collections. The distributed environment we use is a high-bandwidth network of workstations with a shared-nothing memory organization. The text collection is assumed to be evenly distributed among the disks of the various workstations. Our algorithms assume that the total distributed main memory is considerably smaller than the inverted file to be generated. The inverted file is compressed to save memory and disk space and to reduce the time spent moving data to and from disk and across the network. We analyze our algorithms and discuss the tradeoffs among them. We show that, with 8 processors and 16 megabytes of RAM available in each processor, the advanced variants of our algorithms are able to invert a 100-gigabyte collection (the size of the very large TREC-7 collection) in roughly 8 hours. Using 16 processors, this time drops to roughly 4 hours.
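The abstract does not spell out the three algorithms, so the following is only a minimal single-process sketch of the general idea it describes: each node inverts its own share of the collection, and the partial results are combined into one global inverted file whose postings lists are compressed. The function names, the whitespace tokenizer, and the choice of gap encoding with variable-byte codes are all assumptions made for illustration, not the compression scheme or merge strategy of the paper. The merge here also runs entirely in memory, whereas the paper targets the case where memory is much smaller than the index.

```python
from collections import defaultdict


def invert_local(docs):
    """Invert one node's share of the collection (hypothetical helper).

    docs maps a global document id (int) to its text.
    Returns a dict mapping each term to a sorted list of document ids.
    """
    index = defaultdict(list)
    for doc_id in sorted(docs):
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return index


def vbyte_encode(numbers):
    """Variable-byte encode non-negative integers, 7 payload bits per byte."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)  # high bit marks the last byte of a number
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


def compress_postings(doc_ids):
    """Store a non-empty sorted postings list as gaps, then variable-byte encode."""
    gaps = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    return vbyte_encode(gaps)


def decompress_postings(data):
    """Rebuild document ids by summing the decoded gaps."""
    doc_ids, total = [], 0
    for gap in vbyte_decode(data):
        total += gap
        doc_ids.append(total)
    return doc_ids


def merge_global(local_indexes):
    """Combine per-node indexes into one global index with compressed postings."""
    merged = defaultdict(set)
    for index in local_indexes:
        for term, doc_ids in index.items():
            merged[term].update(doc_ids)
    return {term: compress_postings(sorted(ids)) for term, ids in merged.items()}
```

Calling `merge_global([invert_local(node_a), invert_local(node_b)])` on two small dictionaries of documents yields a term-to-bytes mapping, and `decompress_postings` recovers the sorted document ids for any term. Gap encoding keeps the stored numbers small for dense postings lists, which is one common reason compressed inverted files cost less to move across disk and network.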

Berthier A. Ribeiro-Neto, Edleno Silva de Moura, M


Algorithms | Information Management | Inverted File | SIGIR 1999 | Text Collection |


» High-performance distributed inverted files

» Parallel Generation of Inverted Files for Distributed Text Collections

» Challenging Ubiquitous Inverted Files

» Load Balancing Distributed Inverted Files Query Ranking

» Building a distributed full-text index for the Web

» A Tree-Based Inverted File for Fast Ranked-Document Retrieval

» Low-cost management of inverted files for online full-text search

» HAT: a hardware assisted TOP-DOC inverted index component

Post Info

Added	03 Aug 2010
Updated	03 Aug 2010
Type	Conference
Year	1999
Where	SIGIR
Authors	Berthier A. Ribeiro-Neto, Edleno Silva de Moura, Marden S. Neubert, Nivio Ziviani
