Sciweavers

CIKM
2010
Springer

Document allocation policies for selective searching of distributed indexes

13 years 10 months ago
Document allocation policies for selective searching of distributed indexes
Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically, all index shards are searched for each query. For organizations with modest computational resources the high query processing cost incurred in this exhaustive search setup can be a deterrent to working with large collections. This paper investigates document allocation policies that permit searching only a few shards for each query (selective search) without sacrificing search accuracy. Random, source-based and topic-based document-to-shard allocation policies are studied in the context of selective search. A thorough study of the tradeoff between search cost and search accuracy in a sharded index environment is performed using three large TREC collections. The experimental results demonstrate that selective search using topic-based shards cuts the search cost to less than 1/5th of that of the exhaustive s...
Anagha Kulkarni, Jamie Callan
Added 24 Jan 2011
Updated 24 Jan 2011
Type Journal
Year 2010
Where CIKM
Authors Anagha Kulkarni, Jamie Callan
Comments (0)