Vocabulary and language model adaptation using just one speech file


This paper investigates unsupervised vocabulary and language model self-adaptation (VLA) from just one speech file, using the web as a knowledge source and without prior knowledge of topic or domain beyond optional file metadata. Single-file self-adaptation is regularly used for acoustic adaptation but, to date, has rarely been used for VLA. The method investigated here uses a first-pass transcript or file metadata to generate web search queries for retrieving adaptation texts. Various strategies are examined for building queries, retrieving web texts, and maximizing out-of-vocabulary (OOV) recovery while constraining vocabulary growth. Significant improvements are demonstrated for transcribing and searching recorded lectures and telephone calls. The proposed method is orthogonal to acoustic adaptation and system combination, and integrates well into multi-pass recognition architectures.
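The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes queries are formed from the most frequent content words of a first-pass transcript, and that new vocabulary is chosen by frequency in the retrieved texts with a hard cap on growth. The web retrieval step itself is left out. All names here (`STOPWORDS`, `build_queries`, `select_new_words`) and every parameter value are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ranked_counts(tokens):
    """Return (word, count) pairs, most frequent first, ties broken alphabetically."""
    return sorted(Counter(tokens).items(), key=lambda wc: (-wc[1], wc[0]))


def build_queries(transcript, num_queries=3, words_per_query=2):
    """Form search queries from the most frequent non-stopword terms.

    Assumption: frequency ranking stands in for whatever keyword
    selection strategy a real system would use.
    """
    content = [t for t in tokenize(transcript) if t not in STOPWORDS]
    keywords = [w for w, _ in ranked_counts(content)]
    keywords = keywords[: num_queries * words_per_query]
    return [
        " ".join(keywords[i : i + words_per_query])
        for i in range(0, len(keywords), words_per_query)
    ]


def select_new_words(retrieved_texts, base_vocab, max_new_words=2, min_count=2):
    """Pick frequent retrieved words missing from the base vocabulary.

    max_new_words caps vocabulary growth; min_count filters out rare
    words that are more likely noise than recoverable OOV terms.
    """
    tokens = [t for text in retrieved_texts for t in tokenize(text)]
    candidates = [
        w
        for w, c in ranked_counts(tokens)
        if w not in base_vocab and c >= min_count
    ]
    return candidates[:max_new_words]
```

In this sketch the cap and the count threshold are the two knobs that trade OOV recovery against vocabulary growth: raising `max_new_words` or lowering `min_count` admits more candidate words at the cost of a larger, noisier vocabulary.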



Acoustic Adaptation | ICASSP 2010 | Language Model Self-adaptation | Signal Processing | File Metadata



Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Sha Meng, Kishan Thambiratnam, Yimeng Lin, Lifang Wang, Gang Li, Frank Seide
