ECIR
2008
Springer

Probabilistic Document Length Priors for Language Models

Download www.dc.fi.udc.es

This paper addresses the problem of devising a new document prior for the language modeling (LM) approach to Information Retrieval. The prior is based on term statistics, is derived in a probabilistic fashion, and offers a novel way of accounting for document length. Furthermore, we develop a new way of combining document length priors with the query likelihood estimate, based on the risk of accepting the latter as a score. The prior is combined with a document retrieval language model that uses Jelinek-Mercer (JM) smoothing, a technique that does not take document length into account. Adding the prior boosts retrieval performance, so that the resulting model outperforms both an LM with a document-length-dependent smoothing component (Dirichlet prior) and another state-of-the-art, high-performing scoring function (BM25). The improvements are significant and robust across different collections and query sizes.
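
As a rough illustration of the setting described in the abstract, the following Python sketch scores a document by its JM-smoothed query log-likelihood plus a log document prior. The prior used here (probability proportional to document length) and the plain additive combination are placeholder assumptions. They are not the term-statistics prior or the risk-based combination proposed in the paper, whose details are not given in this abstract. All function names, the `lam` parameter convention, and the toy data are hypothetical.

```python
import math
from collections import Counter


def jm_log_likelihood(query, doc, coll_tf, coll_len, lam=0.5):
    """Query log-likelihood under Jelinek-Mercer smoothing.

    Assumed convention: lam is the weight of the collection model,
    (1 - lam) the weight of the document model.
    """
    tf = Counter(doc)
    dl = len(doc)
    total = 0.0
    for term in query:
        p_doc = tf[term] / dl if dl else 0.0
        p_coll = coll_tf.get(term, 0) / coll_len
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Term unseen in both document and collection.
            return float("-inf")
        total += math.log(p)
    return total


def length_log_prior(doc, coll_len):
    """Placeholder prior (an assumption): P(d) proportional to document length."""
    if not doc:
        return float("-inf")
    return math.log(len(doc) / coll_len)


def score(query, doc, coll_tf, coll_len, lam=0.5):
    """Rank-equivalent to log P(q|d) + log P(d), combined additively."""
    return (jm_log_likelihood(query, doc, coll_tf, coll_len, lam)
            + length_log_prior(doc, coll_len))


# Toy data: d2 is d1 repeated, so relative term frequencies are identical.
d1 = ["language", "model", "retrieval"]
d2 = d1 * 2
collection = d1 + d2
coll_tf = Counter(collection)
query = ["language", "model"]

for name, doc in (("d1", d1), ("d2", d2)):
    print(name,
          jm_log_likelihood(query, doc, coll_tf, len(collection)),
          score(query, doc, coll_tf, len(collection)))
```

In this toy collection the two documents have the same relative term frequencies, so the JM likelihood alone cannot separate them. Any difference in the combined score comes entirely from the prior, which is the role a document length prior plays alongside a length-independent smoothing method.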

Roi Blanco, Alvaro Barreiro

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	ECIR
Authors	Roi Blanco, Alvaro Barreiro
