Sciweavers

ICASSP
2011
IEEE

Discriminatively estimated discrete, parametric and smoothed-discrete duration models for speech recognition

The durations of phonemic segments provide important cues for distinguishing words in languages such as Arabic. Recently, we proposed a discriminatively estimated joint acoustic, duration and language model for large vocabulary speech recognition [1]. In that work, we found simple discrete models to be effective for modeling duration, although they were neither smoothed nor parsimonious. These limitations are addressed here with two alternative models: parametric and smoothed-discrete models. Unlike previous work on parametric duration models, we estimate their parameters discriminatively and derive an analytical expression for estimating the parameters of a log-normal distribution using a recent approach [2]. On a large vocabulary Arabic task, we empirically evaluated different segmental units and duration models. Our results show that bigrams of clustered states modeled with smoothed-discrete duration models are more accurate and efficient than the other models considered.
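To make the two model families named in the abstract concrete, the sketch below fits a log-normal (parametric) duration model and an add-k smoothed histogram (smoothed-discrete) model to a handful of segment durations. It is only an illustration under stated assumptions: it uses plain maximum-likelihood estimates rather than the discriminative estimation the paper describes, add-k smoothing is a stand-in because the abstract does not say which smoothing is used, and the function names and sample durations are hypothetical.

```python
import math
from collections import Counter


def fit_lognormal(durations):
    """Maximum-likelihood log-normal fit (NOT the paper's discriminative
    estimator): mean and standard deviation of the log-durations."""
    logs = [math.log(d) for d in durations]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)


def lognormal_logpdf(d, mu, sigma):
    """Log-density of a log-normal distribution at duration d > 0."""
    z = (math.log(d) - mu) / sigma
    return -math.log(d * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


def fit_smoothed_discrete(durations, max_dur, k=1.0):
    """Add-k smoothed histogram over durations 1..max_dur (in frames).
    Add-k smoothing is an assumption made for this sketch; durations
    longer than max_dur are clipped into the last bin."""
    counts = Counter(min(d, max_dur) for d in durations)
    total = len(durations) + k * max_dur
    return {d: (counts[d] + k) / total for d in range(1, max_dur + 1)}


# Hypothetical segment durations, in frames.
durations = [3, 4, 4, 5, 5, 5, 6, 8]

mu, sigma = fit_lognormal(durations)
pmf = fit_smoothed_discrete(durations, max_dur=10)

# The parametric model needs two numbers per unit; the discrete model
# needs one probability per duration bin but can take any shape.
print("log-normal parameters:", mu, sigma)
print("log-density at 5 frames:", lognormal_logpdf(5, mu, sigma))
print("smoothed P(duration = 9):", pmf[9])
```

The contrast in parameter count is the sense in which a parametric model is parsimonious, while smoothing keeps the discrete model from assigning zero probability to durations that never appeared in training.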
Maider Lehr, Izhak Shafran
Added 21 Aug 2011
Updated 21 Aug 2011
Type Journal
Year 2011
Where ICASSP