
Score standardization for inter-collection comparison of retrieval systems

The goal of system evaluation in information retrieval has always been to determine which of a set of systems is superior on a given collection. The tool used to determine system ordering is an evaluation metric such as average precision, which computes relative, collection-specific scores. We argue that a broader goal is achievable. In this paper we demonstrate that, by use of standardization, scores can be made substantially independent of a particular collection, allowing systems to be compared even when they have been tested on different collections. Compared to current methods, our techniques provide richer information about system performance, improved clarity in outcome reporting, and greater simplicity in reviewing results from disparate sources.

Categories and Subject Descriptors: H.3.4 [Information Storage and Retrieval]: Systems and software--performance evaluation.

Keywords: Retrieval experiment, evaluation, average precision, system measurement

General Terms: Measurement, perform...
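The abstract does not spell out the standardization step itself. A natural reading is per-topic z-scoring: each system's raw score on a topic is shifted and scaled by the mean and standard deviation of a reference set of runs on that same topic, so scores from different topics, and hence different collections, sit on a common scale. Below is a minimal sketch under that assumption; the function name and data layout are illustrative, not taken from the paper.

    import statistics

    def standardize(scores_by_topic):
        # scores_by_topic: {topic_id: {system_id: raw score, e.g. average precision}}
        # Returns the same structure with each raw score replaced by its
        # per-topic z-score, (score - topic mean) / topic standard deviation,
        # where mean and deviation are taken over the reference systems
        # evaluated on that topic.
        standardized = {}
        for topic, runs in scores_by_topic.items():
            values = list(runs.values())
            mean = statistics.mean(values)
            sd = statistics.stdev(values)  # sample standard deviation
            standardized[topic] = {sys: (s - mean) / sd
                                   for sys, s in runs.items()}
        return standardized

    # Toy example: two topics, three reference systems.
    raw = {
        "t1": {"A": 0.30, "B": 0.50, "C": 0.40},
        "t2": {"A": 0.05, "B": 0.15, "C": 0.10},
    }
    z = standardize(raw)
    # System B sits one standard deviation above the per-topic mean on both
    # topics (z = 1.0), even though its raw scores differ by a factor of three.

Any further step, such as mapping standardized scores into a bounded range for reporting, is omitted here; this sketch shows only the basic shift-and-scale idea the title refers to.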
Added 15 Dec 2010
Updated 15 Dec 2010
Type Conference
Year 2008
Where SIGIR
Publisher ACM
Authors William Webber, Alistair Moffat, Justin Zobel