ICASSP
2011
IEEE

Generating compound words with high order n-gram information in large vocabulary speech recognition systems

This work concentrates on generating compound words with high order n-gram information for speech recognition. Most existing compound word generation methods consider only bi-gram information. They succeed in improving the performance of bi-gram models but do not work well in higher order n-gram cases. Since 3-gram and 4-gram language models are now commonly used, we present a high order n-gram based computation, called the gradient criterion, that generates compound words automatically and exactly. We tested this method on the Mandarin Open Voice Search (OVS) task and obtained a 0.62% absolute improvement over the 16.44% baseline. This result also outperforms the traditional mutual information based methods. We further tested the history effect and the prediction effect of this criterion and found that the history effect plays the more important role in the decoding task.
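For readers unfamiliar with the bi-gram baseline the abstract compares against, the following is a minimal sketch of mutual information based compound word selection. It is not the paper's gradient criterion, whose formula the abstract does not give. The function names `pmi_compound_candidates` and `merge_pair`, the `min_count` threshold, the `_` joiner, and the toy English corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_compound_candidates(sentences, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper: uses only bi-gram statistics, as in the
    traditional methods the abstract mentions.
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        p_pair = count / total_bigrams
        p_w1 = unigrams[w1] / total_unigrams
        p_w2 = unigrams[w2] / total_unigrams
        scored.append(((w1, w2), math.log(p_pair / (p_w1 * p_w2))))

    # Highest PMI first: these pairs co-occur far more often than chance.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


def merge_pair(words, pair, joiner="_"):
    """Rewrite a sentence so each occurrence of `pair` becomes one token."""
    merged, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) == pair:
            merged.append(words[i] + joiner + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


# Toy corpus, invented for this example only.
corpus = [
    ["new", "york", "is", "big"],
    ["new", "york", "is", "far"],
    ["new", "york", "is", "old"],
    ["it", "is", "big"],
    ["this", "is", "big"],
]
candidates = pmi_compound_candidates(corpus)
best_pair = candidates[0][0]
rewritten = [merge_pair(sentence, best_pair) for sentence in corpus]
```

After selecting a pair, the corpus is rewritten with the merged token and the language model is retrained on it. Because the score looks only at adjacent pairs, it ignores how a merge changes longer histories, which is the limitation the abstract attributes to bi-gram based methods.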
Jie Zhou, Qin Shi, Yong Qin
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where ICASSP
Authors Jie Zhou, Qin Shi, Yong Qin