The existence and use of standard test collections in information retrieval experimentation allows results to be compared between research groups and over time. Such comparisons, however, are rarely made. Most researchers only report results from their own experiments, a practice that allows lack of overall improvement to go unnoticed. In this paper, we analyze results achieved on the TREC Ad-Hoc, Web, Terabyte, and Robust collections as reported in SIGIR (1998–2008) and CIKM (2004–2008). Dozens of individual published experiments report effectiveness improvements, and often claim statistical significance. However, there is little evidence of improvement in ad-hoc retrieval technology over the past decade. Baselines are generally weak, often being below the median original TREC system, and in only a handful of experiments is the score of the best TREC automatic run exceeded. Given this finding, we question the value of achieving even a statistically significant result over a weak baseline.
Timothy G. Armstrong, Alistair Moffat, William Webber, Justin Zobel