Stream-based Randomised Language Models for SMT

Randomised techniques allow very big language models to be represented succinctly. However, being batch-based, they are unsuitable for modelling an unbounded stream of language whilst maintaining a constant error rate. We present a novel randomised language model which uses an online perfect hash function to efficiently deal with unbounded text streams. Translation experiments over a text stream show that our online randomised model matches the performance of batch-based LMs without incurring the computational overhead associated with full retraining. This opens up the possibility of randomised language models which continuously adapt to the massive volumes of texts published on the Web each day.

Abby Levenberg, Miles Osborne

Big Language Models | EMNLP 2009 | Language Models | Natural Language Processing | Text Stream |

Added	17 Feb 2011
Updated	17 Feb 2011
Type	Conference
Year	2009
Where	EMNLP
Authors	Abby Levenberg, Miles Osborne
