Sciweavers

IR
2011

Modeling score distributions in information retrieval

We review the history of modeling score distributions, focusing on the normal-exponential mixture and investigating both the theoretical and the empirical evidence supporting its use. We discuss previously suggested conditions that valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses concerning the component distributions, individually as well as in pairs, under some limiting conditions of parameter values. Of all the mixtures suggested in the past, the current theoretical argument points to the two-gamma mixture as the most likely universal model, with the normal-exponential being a usable approximation. Beyond the theoretical contribution, we provide new experimental evidence showing that vector space or geometric models, and BM25, are ‘friendly’ to the normal-exponential, and that the non-convexity problem of the mixture is not severe in practice. Furthermore, we review recent non-binary mixture...
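As an illustration only (not code from the paper), the normal-exponential mixture mentioned in the abstract can be sketched in a few lines. The sketch assumes the usual reading of that model, in which scores of non-relevant documents follow an exponential distribution and scores of relevant documents follow a normal distribution. The function names and all parameter values (`g`, `mu`, `sigma`, `lam`) are hypothetical and chosen purely for demonstration.

```python
import math


def normal_pdf(s, mu, sigma):
    """Density of a normal distribution at score s."""
    z = (s - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def exponential_pdf(s, lam):
    """Density of an exponential distribution at score s (zero for s < 0)."""
    return lam * math.exp(-lam * s) if s >= 0 else 0.0


def mixture_pdf(s, g, mu, sigma, lam):
    """Binary mixture: a fraction g of documents is assumed relevant (normal),
    the remaining 1 - g non-relevant (exponential)."""
    return g * normal_pdf(s, mu, sigma) + (1.0 - g) * exponential_pdf(s, lam)


def prob_relevant(s, g, mu, sigma, lam):
    """Posterior probability of relevance at score s under the mixture."""
    total = mixture_pdf(s, g, mu, sigma, lam)
    return g * normal_pdf(s, mu, sigma) / total if total > 0 else 0.0


# Hypothetical parameter values, for demonstration only.
params = dict(g=0.1, mu=5.0, sigma=1.0, lam=1.0)

for score in (1.0, 5.0, 12.0):
    print(score, prob_relevant(score, **params))
```

With these made-up parameters the posterior rises from low scores towards the mean of the normal component and then falls again at very high scores, because the normal tail decays faster than the exponential one. This is one way to see the non-convexity problem the abstract refers to; how much it matters in practice depends on where real scores actually fall.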
Avi Arampatzis, Stephen Robertson
Added 30 Aug 2011
Updated 30 Aug 2011
Type Journal
Year 2011
Where IR
Authors Avi Arampatzis, Stephen Robertson