Sciweavers

ICASSP
2011
IEEE

Automatically finding semantically consistent n-grams to add new words in LVCSR systems

This paper presents a new method for automatically adding n-grams that contain out-of-vocabulary (OOV) words to a baseline language model (LM). The added n-grams are meant to be grammatically correct and consistent with the meaning of the OOV words. First, the method determines the word sequences, i.e., n-grams, in which the usage of a given OOV word is most semantically consistent. Then, the conditional probabilities of these n-grams are computed. To do this, semantic relations between words are used to map each OOV word to several equivalent in-vocabulary words. Based on these equivalent words, n-grams from the baseline LM are reused to find the word sequences to be added and to compute their probabilities. After the vocabulary is augmented and recognition is run, experiments show that our method yields word error rate (WER) improvements comparable to those obtained with a state-of-the-art open-vocabulary LM.
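The reuse step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `oov_ngrams`, the dictionary representation of the LM, the similarity weights, and the weighted-average rule for probabilities are all assumptions made for illustration, since the abstract does not specify how the probabilities are combined.

```python
from collections import defaultdict


def oov_ngrams(baseline, oov, equivalents):
    """Derive n-grams for an OOV word from a baseline n-gram table.

    baseline:    {tuple of words: conditional probability} (assumed format)
    oov:         the out-of-vocabulary word to add
    equivalents: {in-vocabulary word: similarity weight} (hypothetical weights)
    """
    total = sum(equivalents.values())
    added = defaultdict(float)
    for ngram, prob in baseline.items():
        for word, weight in equivalents.items():
            if word not in ngram:
                continue
            # Reuse the baseline n-gram, substituting the OOV word
            # for its in-vocabulary equivalent.
            new_ngram = tuple(oov if w == word else w for w in ngram)
            # Assumed combination rule: similarity-weighted average of the
            # baseline probabilities. The result is not renormalized.
            added[new_ngram] += (weight / total) * prob
    return dict(added)


# Toy baseline LM and toy equivalents, invented for this example.
baseline = {
    ("eat", "the", "apple"): 0.2,
    ("eat", "the", "orange"): 0.1,
    ("the", "apple"): 0.05,
}
new = oov_ngrams(baseline, "kumquat", {"apple": 3.0, "orange": 1.0})
```

In a real system the derived n-grams would still have to be merged into the LM and the affected distributions renormalized; the sketch leaves that step out.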
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where ICASSP
Authors Gwénolé Lecorvé, Guillaume Gravier, Pascale Sébillot