An unsupervised web-based topic language model adaptation method

This paper addresses the adaptation of automatic speech recognition (ASR) systems, whose language models (LMs) are usually trained on topic-independent corpora, to new topics, in particular for broadcast news. We propose a complete, fully unsupervised technique that uses information retrieval methods to select keywords from each segment and then builds a thematically coherent adaptation corpus from the Internet. The LM used for the initial transcription is then adapted before rescoring word lattices. Experimental results demonstrate the validity of the proposed technique, with a significant reduction in perplexity after LM adaptation. Word error rates also improve in some cases, though to a lesser extent.
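The pipeline described in the abstract above (keyword selection, adaptation corpus, LM adaptation, perplexity check) can be sketched roughly as follows. This is a toy illustration only, not the authors' implementation. The abstract does not say which information retrieval criterion or adaptation scheme is used, so tf-idf scoring, add-one smoothed unigram models, and linear interpolation are all assumptions here. Every function name and data value is hypothetical, and `web_text` merely stands in for pages that would be retrieved from the Internet using the selected keywords.

```python
import math
from collections import Counter

def tfidf_keywords(segment_tokens, background_docs, k=3):
    """Rank the words of one segment by tf-idf (assumed criterion)."""
    tf = Counter(segment_tokens)
    n_docs = len(background_docs)
    scores = {}
    for word, count in tf.items():
        df = sum(1 for doc in background_docs if word in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed idf
        scores[word] = (count / len(segment_tokens)) * idf
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]

def unigram_lm(tokens, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def interpolate(general, adapted, lam=0.5):
    """Linear interpolation of two models (assumed adaptation scheme)."""
    return {w: lam * adapted[w] + (1 - lam) * general[w] for w in general}

def perplexity(lm, tokens):
    """Per-word perplexity of a token sequence under a unigram model."""
    log_prob = sum(math.log(lm[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))

# Hypothetical toy data.
general_text = "the market rose the team won the vote passed".split()
segment = "the election vote count the vote recount".split()
web_text = "vote election recount ballot vote election".split()
background = [set("the market rose".split()),
              set("the team won".split()),
              set("the vote passed".split())]

keywords = tfidf_keywords(segment, background)
vocab = set(general_text) | set(segment) | set(web_text)
general_lm = unigram_lm(general_text, vocab)
adapted_lm = interpolate(general_lm, unigram_lm(web_text, vocab))

ppl_general = perplexity(general_lm, segment)
ppl_adapted = perplexity(adapted_lm, segment)
print(keywords, ppl_general, ppl_adapted)
```

On this toy data the interpolated model assigns the segment a lower perplexity than the general model. That shows only the direction of the effect reported in the abstract, not its size.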

Gwénolé Lecorvé, Guillaume Gravier, Pascale Sébillot

Better Adapt ASR | Coherent Adaptation Corpus | ICASSP 2008 | LM Adaptation | Signal Processing

» Unsupervised language model adaptation via topic modeling based on named entity hypotheses

» Unsupervised Language Model Adaptation Incorporating Named Entity Information

» Unsupervised Topic Modelling for Multi-Party Spoken Discourse

» Vocabulary and language model adaptation using just one speech file

» Evaluating Web Based Instructional Models Using Association Rule Mining

» Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails

» Unsupervised speaker adaptation for telephone call transcription

» Topic model methods for automatically identifying out-of-scope resources

Post Info

Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Gwénolé Lecorvé, Guillaume Gravier, Pascale Sébillot
