Sciweavers

ACL
2007

Bilingual-LSA Based LM Adaptation for Spoken Language Translation

14 years 1 months ago
Bilingual-LSA Based LM Adaptation for Spoken Language Translation
We propose a novel approach to crosslingual language model (LM) adaptation based on bilingual Latent Semantic Analysis (bLSA). A bLSA model is introduced which enables latent topic distributions to be efficiently transferred across languages by enforcing a one-to-one topic correspondence during training. Using the proposed bLSA framework crosslingual LM adaptation can be performed by, first, inferring the topic posterior distribution of the source text and then applying the inferred distribution to the target language N-gram LM via marginal adaptation. The proposed framework also enables rapid bootstrapping of LSA models for new languages based on a source LSA model from another language. On Chinese to English speech and text translation the proposed bLSA framework successfully reduced word perplexity of the English LM by over 27% for a unigram LM and up to 13.6% for a 4-gram LM. Furthermore, the proposed approach consistently improved machine translation quality on both speech and ...
Yik-Cheung Tam, Ian R. Lane, Tanja Schultz
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where ACL
Authors Yik-Cheung Tam, Ian R. Lane, Tanja Schultz
Comments (0)