SIGIR
2002
ACM

Two-stage language models for information retrieval

The optimal settings of retrieval parameters often depend on both the document collection and the query, and are usually found through empirical tuning. In this paper, we propose a family of two-stage language models for information retrieval that explicitly captures the different influences of the query and document collection on the optimal settings of retrieval parameters. As a special case, we present a two-stage smoothing method that allows us to estimate the smoothing parameters completely automatically. In the first stage, the document language model is smoothed using a Dirichlet prior with the collection language model as the reference model. In the second stage, the smoothed document language model is further interpolated with a query background language model. We propose a leave-one-out method for estimating the Dirichlet parameter of the first stage, and the use of document mixture models for estimating the interpolation parameter of the second stage. Evaluation on five dif...
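The two smoothing stages described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `two_stage_prob`, the dictionary-based models, and the default values of `mu` and `lam` are assumptions made for this example. The abstract says both parameters are estimated automatically (leave-one-out for the Dirichlet parameter, document mixture models for the interpolation parameter); that estimation is not shown here, so the values are passed in by hand.

```python
from collections import Counter


def two_stage_prob(word, doc_tokens, collection_model, query_background_model,
                   mu=1000.0, lam=0.5):
    """Two-stage smoothed probability of `word` under one document's language model.

    collection_model and query_background_model map word -> probability.
    mu (Dirichlet prior parameter) and lam (interpolation weight) are
    hand-picked placeholders in this sketch, not estimated values.
    """
    counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)

    # Stage 1: Dirichlet prior smoothing, with the collection language
    # model as the reference model.
    p_stage1 = (counts[word] + mu * collection_model.get(word, 0.0)) / (doc_len + mu)

    # Stage 2: interpolate the smoothed document model with a query
    # background language model.
    return (1.0 - lam) * p_stage1 + lam * query_background_model.get(word, 0.0)


# Toy usage: a three-token document and a two-word vocabulary.
doc = ["a", "a", "b"]
collection = {"a": 0.5, "b": 0.5}
background = {"a": 0.5, "b": 0.5}  # assumed equal to the collection model here
p_a = two_stage_prob("a", doc, collection, background, mu=1.0, lam=0.5)
p_b = two_stage_prob("b", doc, collection, background, mu=1.0, lam=0.5)
```

With `lam=0` the function reduces to plain Dirichlet prior smoothing, which makes the separate roles of the two stages easy to see: `mu` reflects the document collection, while `lam` reflects the query.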
ChengXiang Zhai, John D. Lafferty
Added 23 Dec 2010
Updated 23 Dec 2010
Type Journal
Year 2002
Where SIGIR