Inverted index compression via online document routing

Modern search engines are expected to make documents searchable shortly after they appear on the ever-changing Web. To satisfy this requirement, the Web is crawled frequently. Because of the sheer size of their indexes, search engines distribute the crawled documents among thousands of servers in a scheme called local index-partitioning, such that each server indexes only several million pages. For load balancing, documents are commonly routed to servers at random, which ensures that documents from the same host (e.g., www.nytimes.com) are distributed uniformly over the servers. To shorten the time between crawling a document and making it searchable, documents may simply be appended to the existing index partitions. However, indexing by merely appending documents results in larger indexes, since document reordering for index compactness is no longer performed. This, in turn, degrades query processing performance, which depends heavily on index size. A possible way to balance quick d...
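
The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's method: the function names, the two placeholder terms, and the use of Elias gamma coding for posting-list gaps are all assumptions made for illustration. The sketch routes documents to servers at random, builds an append-only index, and counts the bits needed when similar documents receive adjacent IDs versus when they arrive interleaved.

```python
import random
from collections import defaultdict


def route_randomly(doc_ids, num_servers, seed=0):
    """Assign each document to a uniformly random server (hypothetical helper)."""
    rng = random.Random(seed)
    partitions = defaultdict(list)
    for doc_id in doc_ids:
        partitions[rng.randrange(num_servers)].append(doc_id)
    return partitions


def build_appended_index(docs):
    """Append documents in arrival order; local IDs are 1, 2, 3, ..."""
    index = defaultdict(list)
    for local_id, terms in enumerate(docs, start=1):
        for term in set(terms):
            index[term].append(local_id)
    return index


def gamma_bits(gap):
    """Bits used by an Elias gamma code for a positive integer gap."""
    return 2 * (gap.bit_length() - 1) + 1


def index_size_bits(index):
    """Total size of all posting lists when stored as gamma-coded gaps."""
    total = 0
    for postings in index.values():
        previous = 0
        for doc_id in postings:
            total += gamma_bits(doc_id - previous)
            previous = doc_id
    return total


# Assumption: each document carries one term that stands in for its host's vocabulary.
clustered = [["nytimes"]] * 4 + [["yale"]] * 4   # similar documents get adjacent IDs
interleaved = [["nytimes"], ["yale"]] * 4        # arrival order mixes the two hosts

print(index_size_bits(build_appended_index(clustered)))
print(index_size_bits(build_appended_index(interleaved)))
```

The interleaved ordering produces larger gaps in every posting list, so the same documents cost more bits to index. This is the size penalty that appending without reordering incurs.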

Gal Lavee, Ronny Lempel, Edo Liberty, Oren Somekh

Document | Index Size | Internet Technology | Routing | WWW 2011

Post Info

Added	17 May 2011
Updated	17 May 2011
Type	Journal
Year	2011
Where	WWW
Authors	Gal Lavee, Ronny Lempel, Edo Liberty, Oren Somekh
