Power law discounting for n-gram language models

We present an approximation to the Bayesian hierarchical Pitman-Yor process language model that maintains the power law distribution over word tokens without requiring a computationally expensive approximate inference process. This approximation, which we term power law discounting, has a computational complexity similar to that of interpolated and modified Kneser-Ney smoothing. We performed experiments on meeting transcription using the NIST RT06s evaluation data and the AMI corpus, with a vocabulary of 50,000 words and a language model training set of up to 211 million words. Our results indicate that power law discounting yields statistically significant reductions in perplexity and word error rate compared to both interpolated and modified Kneser-Ney smoothing, while producing results similar to those of the hierarchical Pitman-Yor process language model.
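The abstract does not give the discounting formula, so the following is only a minimal sketch of the general idea of count-based discounting with interpolation, not the authors' method. It builds a bigram model in which a pluggable `discount(c)` function decides how much probability mass to remove from a count `c` and hand to a lower-order distribution. The names `train_bigram`, `absolute_discount`, and `power_law_discount` are hypothetical, the discount shape `d * c ** d` is an illustrative assumption, and the lower-order model is a plain maximum-likelihood unigram rather than the continuation counts used by Kneser-Ney.

```python
from collections import Counter, defaultdict


def absolute_discount(c, d=0.75):
    # Constant discount: every seen count loses the same amount d.
    return d


def power_law_discount(c, d=0.75):
    # ASSUMPTION (illustrative only, not taken from the abstract):
    # the mass removed grows sublinearly with the count, as d * c**d.
    # For c >= 1 and 0 < d < 1 this never exceeds c.
    return d * c ** d


def train_bigram(tokens, discount):
    """Return prob(word, prev) for an interpolated, discounted bigram model."""
    unigram = Counter(tokens)
    total = sum(unigram.values())
    bigram = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        bigram[prev][word] += 1

    def prob(word, prev):
        p_uni = unigram[word] / total  # simplified lower-order model
        followers = bigram.get(prev)
        if not followers:
            return p_uni  # unseen context: fall back to the unigram
        context_total = sum(followers.values())
        # Total mass taken from seen bigrams in this context.
        removed = sum(discount(c) for c in followers.values())
        c = followers.get(word, 0)
        kept = c - discount(c) if c > 0 else 0.0
        # Discounted count plus the removed mass, redistributed by p_uni.
        return (kept + removed * p_uni) / context_total

    return prob


tokens = "the cat sat on the mat the cat ran".split()
model = train_bigram(tokens, power_law_discount)
print(model("cat", "the"), model("sat", "the"))
```

Because each model is built from a single pass over the counts and needs no sampling, its cost is comparable whichever discount function is plugged in, which is the kind of trade-off the abstract describes. Swapping `power_law_discount` for `absolute_discount` gives ordinary interpolated absolute discounting for comparison.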

Songfang Huang, Steve Renals

ICASSP 2010 | Language Model | Power Law | Process Language Model | Signal Processing |

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Songfang Huang, Steve Renals
