Estimation of statistical translation models based on mutual information for ad hoc information retrieval

As a principled approach to capturing semantic relations between words in information retrieval, statistical translation models have been shown to outperform simple document language models, which rely on exact matching of words in the query and documents. A main challenge in applying translation models to ad hoc information retrieval is to estimate a translation model without training data. Existing work has relied on training on synthetic queries generated from a document collection. However, this method is computationally expensive and does not provide good coverage of query words. In this paper, we propose an alternative way to estimate a translation model based on normalized mutual information between words, which is less computationally expensive and has better coverage of query words than the synthetic query method of estimation. We also propose to regularize estimated translation probabilities to ensure sufficient probability mass for self-translation. Experiment results show tha...
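The two ideas named in the abstract, normalizing mutual information into translation probabilities and reserving probability mass for self-translation, can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the authors' formulation: mutual information is computed from binary word presence in documents, p(w|u) is taken as I(w;u) divided by the sum of I(w';u) over the vocabulary, and the regularizer is a simple interpolation with a hypothetical parameter `alpha`. The function names and the toy collection are also illustrative.

```python
import math


def mutual_information(docs):
    """Return (vocab, mi) where mi[(w, u)] is I(w; u) over binary word presence."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    df = {w: sum(1 for d in doc_sets if w in d) for w in vocab}
    mi = {}
    for w in vocab:
        for u in vocab:
            n11 = sum(1 for d in doc_sets if w in d and u in d)
            # Document counts for the four presence/absence combinations.
            cells = {
                (1, 1): n11,
                (1, 0): df[w] - n11,
                (0, 1): df[u] - n11,
                (0, 0): n - df[w] - df[u] + n11,
            }
            total = 0.0
            for (xw, xu), count in cells.items():
                if count == 0:
                    continue  # a zero-probability cell contributes nothing
                p_joint = count / n
                p_w = df[w] / n if xw else 1 - df[w] / n
                p_u = df[u] / n if xu else 1 - df[u] / n
                total += p_joint * math.log(p_joint / (p_w * p_u))
            mi[(w, u)] = max(0.0, total)  # guard against float round-off
    return vocab, mi


def translation_probs(vocab, mi):
    """Normalize MI scores so that, for each u, p(w | u) sums to 1 over w."""
    probs = {}
    for u in vocab:
        z = sum(mi[(w, u)] for w in vocab)
        for w in vocab:
            if z > 0:
                probs[(w, u)] = mi[(w, u)] / z
            else:
                # Assumption: with no MI signal, fall back to pure self-translation.
                probs[(w, u)] = 1.0 if w == u else 0.0
    return probs


def regularize_self_translation(vocab, probs, alpha):
    """Assumed regularizer: guarantee at least `alpha` mass on p(u | u)."""
    out = {}
    for u in vocab:
        for w in vocab:
            p = (1 - alpha) * probs[(w, u)]
            if w == u:
                p += alpha
            out[(w, u)] = p
    return out


# Toy collection (made up for illustration only).
docs = [["a", "b"], ["a", "b"], ["c"], ["c", "d"]]
vocab, mi = mutual_information(docs)
probs = translation_probs(vocab, mi)
reg = regularize_self_translation(vocab, probs, alpha=0.3)
```

Because the interpolation scales all probabilities by `1 - alpha` before adding `alpha` to the diagonal, each column still sums to 1 and every word keeps at least `alpha` probability of translating to itself. Note that mutual information computed this way also rewards strong negative association, so a practical variant might filter such pairs.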

Maryam Karimzadehgan, ChengXiang Zhai

Improve Retrieval Accuracy | Information Management | Mutual Information-based Estimation | SIGIR 2010 | Translation Model

Post Info

Added	16 Aug 2010
Updated	16 Aug 2010
Type	Conference
Year	2010
Where	SIGIR
Authors	Maryam Karimzadehgan, ChengXiang Zhai
