Sciweavers

Springer

MIREX: MapReduce Information Retrieval Experiments

We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at: http://sourceforge.net/projects/mirex/
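
The sequential-scan idea in the abstract can be illustrated with a minimal single-process sketch in Python. This is not the MIREX code: the function names (`map_document`, `reduce_query`, `sequential_scan`), the toy term-count score, and the sample documents and queries are all assumptions made for illustration. The map step scores every document against every query without an index, and the reduce step keeps the top-k results per query.

```python
import heapq
from collections import defaultdict


def map_document(doc_id, text, queries):
    """Map step (hypothetical): score one document against every query."""
    terms = text.lower().split()
    for query_id, query in queries.items():
        # Toy score (an assumption): total occurrences of the query terms.
        score = sum(terms.count(t) for t in query.lower().split())
        if score > 0:
            yield query_id, (score, doc_id)


def reduce_query(pairs, k):
    """Reduce step (hypothetical): keep the k highest-scoring documents."""
    return heapq.nlargest(k, pairs)


def sequential_scan(docs, queries, k=2):
    """Simulate the shuffle phase in memory, then reduce each query's group."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():  # scan all documents, no index
        for query_id, pair in map_document(doc_id, text, queries):
            grouped[query_id].append(pair)
    return {q: reduce_query(pairs, k) for q, pairs in grouped.items()}


# Made-up sample data for illustration only.
docs = {
    "d1": "mapreduce makes sequential scanning easy",
    "d2": "inverted index for web search",
    "d3": "web crawl scanning with mapreduce mapreduce",
}
queries = {"q1": "mapreduce scanning", "q2": "web search"}
results = sequential_scan(docs, queries)
print(results)
```

On a real cluster, the map calls would run in parallel over partitions of the crawl and the framework would handle the grouping by query; to try a new retrieval approach in this sketch, only the scoring line in `map_document` would need to change.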
Djoerd Hiemstra, Claudia Hauff
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2010
Where CORR