Measuring the similarity between two texts is a fundamental problem in many NLP and IR applications. Among the existing approaches, the cosine measure of the term vectors represen...
Language model (LM) adaptation is often achieved by combining a generic LM with a topic-specific model that is more relevant to the target document. Unlike previous work on unsup...
The recent availability of large collections of text such as the Google 1T 5-gram corpus (Brants and Franz, 2006) and the Gigaword corpus of newswire (Graff, 2003) has made it po...
Performance of n-gram language models depends to a large extent on the amount of training text material available for building the models and the degree to which this text matches...
We present a novel language modeling approach to capturing the query reformulation behavior of Web search users. Based on a framework that categorizes eight different types of “...