Sciweavers

SIGIR
2009
ACM

Positional language models for information retrieval

Although many variants of language models have been proposed for information retrieval, two related retrieval heuristics remain “external” to the language modeling approach: (1) the proximity heuristic, which rewards a document where the matched query terms occur close to each other; and (2) passage retrieval, which scores a document mainly based on its best-matching passage. Existing studies have only attempted to use a standard language model as a “black box” to implement these heuristics, making it hard to optimize the combination parameters. In this paper, we propose a novel positional language model (PLM) which implements both heuristics in a unified language model. The key idea is to define a language model for each position of a document, and score a document based on the scores of its PLMs. The PLM is estimated based on propagated counts of words within a document through a proximity-based density function, which both captures proximity heuristics and achieves an e...
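The key idea in the abstract (one language model per document position, estimated from propagated word counts) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not a detail confirmed by the truncated abstract: the Gaussian shape of the density function, the `sigma` and `mu` parameters, the additive smoothing, the query log-likelihood scoring, and the best-position aggregation in `score_document`. All function names are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_kernel(i, j, sigma):
    """Weight propagated from position j to position i (Gaussian shape is an assumption)."""
    return math.exp(-((i - j) ** 2) / (2 * sigma ** 2))


def positional_language_model(tokens, i, sigma=2.0):
    """Estimate p(w | D, i) by normalizing the propagated counts seen at position i."""
    virtual = defaultdict(float)
    for j, word in enumerate(tokens):
        # Each occurrence contributes a fractional count that decays with distance.
        virtual[word] += gaussian_kernel(i, j, sigma)
    total = sum(virtual.values())
    return {w: c / total for w, c in virtual.items()}


def score_position(query, plm, vocab_size, mu=0.01):
    """Query log-likelihood under one PLM, with additive smoothing (an assumption)."""
    return sum(
        math.log((plm.get(q, 0.0) + mu) / (1.0 + mu * vocab_size)) for q in query
    )


def score_document(query, tokens, sigma=2.0):
    """Score a document by its best-scoring position (one possible aggregation)."""
    vocab_size = len(set(tokens) | set(query))
    return max(
        score_position(query, positional_language_model(tokens, i, sigma), vocab_size)
        for i in range(len(tokens))
    )


# Two toy documents with identical bags of words but different term proximity.
doc_near = "a b x y z w v u".split()
doc_far = "a x y z w v u b".split()
query = ["a", "b"]
near_score = score_document(query, doc_near)
far_score = score_document(query, doc_far)
```

In this toy setup the two documents contain the same words, so a bag-of-words model could not tell them apart, while the positional estimate favors the document in which the query terms sit next to each other. Taking the maximum over positions also mirrors the passage-retrieval intuition of scoring by the best-matching region.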
Yuanhua Lv, ChengXiang Zhai
Added 28 May 2010
Updated 28 May 2010
Type Conference
Year 2009
Where SIGIR
Authors Yuanhua Lv, ChengXiang Zhai