SIGIR
2010
ACM

Score distribution models: assumptions, intuition, and robustness to score manipulation

Inferring the score distribution of relevant and non-relevant documents is an essential task for many IR applications (e.g. information filtering, recall-oriented IR, meta-search, distributed IR). Modeling score distributions in an accurate manner is the basis of any inference. Thus, numerous score distribution models have been proposed in the literature. Most of the models were proposed on the basis of empirical evidence and goodness-of-fit. In this work, we model score distributions in a rather different, systematic manner. We start with a basic assumption on the distribution of terms in a document. Following the transformations applied on term frequencies by two basic ranking functions, BM25 and Language Models, we derive the distribution of the produced scores for all documents. Then we focus on the relevant documents. We detach our analysis from particular ranking functions. Instead, we consider a model for precision-recall curves, and given this model, we present a general mathe...
Added: 06 Dec 2010
Updated: 06 Dec 2010
Type: Conference
Year: 2010
Where: SIGIR
Authors: Evangelos Kanoulas, Keshi Dai, Virgiliu Pavlu, Javed A. Aslam