Sciweavers

CIKM
2005
Springer

Fast on-line index construction by geometric partitioning

Inverted index structures are the mainstay of modern text retrieval systems. They can be constructed quickly using off-line merge-based methods, and provide efficient support for a variety of querying modes. In this paper we examine the task of on-line index construction: that is, how to build an inverted index when the underlying data must be continuously queryable, and the documents must be indexed and available for search as soon as they are inserted. When straightforward approaches are used, document insertions become increasingly expensive as the size of the database grows. This paper describes a mechanism based on controlled partitioning that can be adapted to suit different balances of insertion and querying operations, and is faster and scales better than previous methods. Using experiments on 100 GB of web data we demonstrate the efficiency of our methods in practice, showing that they dramatically reduce the cost of on-line index construction. Categories and Subject Descrip...
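The abstract does not spell out the partitioning scheme, so the following is only an illustrative sketch of the general idea of geometrically sized index partitions, not the authors' algorithm. It assumes a small in-memory buffer that is flushed as a new partition, and a merge rule under which a level that collects `radix` partitions is merged into one partition at the next level, so partition sizes grow geometrically and most insertions touch only small partitions. The names `GeometricIndex`, `buffer_docs`, `radix`, and `merge` are hypothetical.

```python
from collections import defaultdict


def merge(partitions):
    """Merge several partitions (term -> doc ids) into one partition."""
    merged = defaultdict(list)
    for part in partitions:
        for term, docs in part.items():
            merged[term].extend(docs)
    return {term: sorted(docs) for term, docs in merged.items()}


class GeometricIndex:
    """Toy on-line inverted index (illustrative assumption, not the paper's code)."""

    def __init__(self, buffer_docs=2, radix=2):
        self.buffer_docs = buffer_docs    # documents held in memory before a flush
        self.radix = radix                # partitions per level that trigger a merge
        self.buffer = defaultdict(list)   # term -> doc ids not yet flushed
        self.buffered = 0
        self.levels = []                  # levels[k] = list of partitions at level k
        self.next_doc_id = 0

    def insert(self, text):
        doc_id = self.next_doc_id
        self.next_doc_id += 1
        for term in set(text.lower().split()):
            self.buffer[term].append(doc_id)
        self.buffered += 1
        if self.buffered >= self.buffer_docs:
            self._flush()
        return doc_id

    def _flush(self):
        # Turn the buffer into a partition, then carry merges upward
        # in the manner of incrementing a base-`radix` counter.
        partition = dict(self.buffer)
        self.buffer = defaultdict(list)
        self.buffered = 0
        level = 0
        while True:
            if level == len(self.levels):
                self.levels.append([])
            self.levels[level].append(partition)
            if len(self.levels[level]) < self.radix:
                break
            partition = merge(self.levels[level])
            self.levels[level] = []
            level += 1

    def search(self, term):
        # A query consults the buffer and every partition, so documents
        # are searchable immediately after insertion.
        term = term.lower()
        hits = list(self.buffer.get(term, []))
        for parts in self.levels:
            for part in parts:
                hits.extend(part.get(term, []))
        return sorted(hits)
```

In this sketch a larger `radix` means fewer merges per insertion but more partitions to consult per query, which is one way to picture the trade-off between insertion and querying cost mentioned in the abstract.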
Nicholas Lester, Alistair Moffat, Justin Zobel
Added 13 Oct 2010
Updated 13 Oct 2010
Type Conference
Year 2005
Where CIKM