Traditional n-gram language models are widely used in state-of-the-art large vocabulary speech recognition systems. This simple model suffers from some limitations, such as overfi...
We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions called Pitman-Yor pr...
In this paper, we propose a new application of Bayesian language model based on Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distr...
The inclusion of document length factors has been a major topic in the development of retrieval models. We believe that current models can be further improved by more refined est...
Abstract. Robustly estimating the state-transition probabilities of highorder Markov processes is an essential task in many applications such as natural language modeling or protei...