ICASSP
2010
IEEE

Morphology-based and sub-word language modeling for Turkish speech recognition

We explore morphology-based and sub-word language modeling approaches proposed for morphologically rich languages, and evaluate and contrast them on a Turkish broadcast news transcription task. In addition, we improve our previously proposed morphology-integrated model for automatic speech recognition. This morphology-based model is built by composing the finite-state transducer of the morphological parser with a language model over lexical morphemes. The approach yields a morphology-integrated search network with an unlimited vocabulary that generates only valid word forms, which reduces the out-of-vocabulary rate and hence lowers the word error rate. We also analyze the effect of morphotactics and morphological disambiguation on speech recognition accuracy for the morphology-integrated model. The improved morphology-integrated model performs better than statistically derived sub-word models, with the added benefit of generating morphosyntactic and semantic features.
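
The link between sub-word units and the out-of-vocabulary rate mentioned in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the six-word toy corpus, the hand-written segmentations in TOY_SEGMENTATION (surface morphemes rather than the lexical morphemes used in the paper), and the helper names to_units and oov_rate. It is not the authors' parser or language model, only a way to see why units smaller than words can cover unseen word forms.

```python
# Toy illustration only. The segmentations are hand-written assumptions,
# not the output of the paper's morphological parser.
TOY_SEGMENTATION = {
    "evler": ["ev", "+ler"],
    "evde": ["ev", "+de"],
    "evlerde": ["ev", "+ler", "+de"],
    "okullar": ["okul", "+lar"],
    "okulda": ["okul", "+da"],
    "okullarda": ["okul", "+lar", "+da"],
}


def to_units(words, segment):
    """Map words to modeling units: whole words, or toy morphemes if segment is True."""
    units = []
    for word in words:
        units.extend(TOY_SEGMENTATION[word] if segment else [word])
    return units


def oov_rate(train_words, test_words, segment):
    """Fraction of test units that never occur among the training units."""
    vocab = set(to_units(train_words, segment))
    test_units = to_units(test_words, segment)
    unseen = sum(1 for unit in test_units if unit not in vocab)
    return unseen / len(test_units)


train = ["evler", "evde", "okullar", "okulda"]
test = ["evlerde", "okullarda"]  # word forms absent from the training data

word_oov = oov_rate(train, test, segment=False)
morph_oov = oov_rate(train, test, segment=True)
print(f"word-level OOV rate: {word_oov:.2f}")
print(f"morpheme-level OOV rate: {morph_oov:.2f}")
```

In this toy setting both test words are unseen as whole words, yet every morpheme they contain appears in the training data. Unlike the morphology-integrated model described above, free concatenation of such units does not by itself guarantee that only valid word forms are generated.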
Hasim Sak, Murat Saraclar, Tunga Güngör
Added 06 Dec 2010
Updated 06 Dec 2010
Type Conference
Year 2010
Where ICASSP
Authors Hasim Sak, Murat Saraclar, Tunga Güngör