SIGIR
2009
ACM

SUSHI: scoring scaled samples for server selection

Modern techniques for distributed information retrieval use a set of documents sampled from each server, but these samples have been underutilised in server selection. We describe a new server selection algorithm, SUSHI, which unlike earlier algorithms can make full use of the text of each sampled document and which does not need training data. SUSHI can directly optimise for many common cases, including high precision retrieval, and by including a simple stopping condition can do so while reducing network traffic. Our experiments compare SUSHI with alternatives and show it achieves the same effectiveness as the best current methods while being substantially more efficient, selecting as few as 20% as many servers.

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—selection process; H.3.4 [Information Storage and Retrieval]: Systems and Software—distributed systems

General Terms: Experimentation, Measurement

Keywords: Docu...
Paul Thomas, Milad Shokouhi
Added 28 May 2010
Updated 28 May 2010
Type Conference
Year 2009
Where SIGIR
Authors Paul Thomas, Milad Shokouhi