Sciweavers

ICASSP
2009
IEEE

Modelling the prepausal lengthening effect for speech recognition: a dynamic Bayesian network approach

The speech unit preceding a pause tends to lengthen. This work uses a dynamic Bayesian network to model this prepausal lengthening effect for robust speech recognition. Specifically, we introduce two distributions to model inter-state transitions, one for prepausal words and one for non-prepausal words. The choice of transition distribution depends on a random variable whose value is influenced by whether a pause will appear between the current word and the following word. Two experiments are presented. The first considers pauses hypothesised during speech decoding. The second employs an extra component for speech/non-speech determination. Modelling the prepausal lengthening effect yields a 5.5% relative reduction in word error rate on the 500-word task of the SVitchboard corpus.
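The idea of switching between two transition distributions can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `TransitionModel`, the helper `word_stay_probs`, and the probability values are hypothetical. The sketch also simplifies the model by treating the switching variable as a deterministic function of whether a pause follows, whereas the abstract only says the variable is influenced by it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionModel:
    """Self-loop probabilities for one state under the two conditions.

    Both values are illustrative assumptions, not figures from the paper.
    """
    stay_non_prepausal: float
    stay_prepausal: float

    def stay_prob(self, prepausal: bool) -> float:
        # The binary switch selects which transition distribution applies.
        return self.stay_prepausal if prepausal else self.stay_non_prepausal

    def expected_duration(self, prepausal: bool) -> float:
        # With a fixed self-loop probability the state duration is geometric,
        # so the mean number of frames is 1 / (1 - P(stay)).
        return 1.0 / (1.0 - self.stay_prob(prepausal))


def word_stay_probs(model: TransitionModel, pause_follows: list[bool]) -> list[float]:
    """Pick a self-loop probability per word from its pause indicator.

    Simplification: a word is treated as prepausal exactly when a pause
    follows it.
    """
    return [model.stay_prob(p) for p in pause_follows]


# Hypothetical values: a larger self-loop probability before a pause
# gives longer expected state durations.
model = TransitionModel(stay_non_prepausal=0.5, stay_prepausal=0.75)
probs = word_stay_probs(model, [False, False, True])
```

With these assumed values, a state in a prepausal word has a longer expected duration than the same state in a non-prepausal word, which is the lengthening effect the two distributions are meant to capture.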
Ning Ma, Chris Bartels, Jeff A. Bilmes, Phil Green
Added 21 May 2010
Updated 21 May 2010
Type Conference
Year 2009
Where ICASSP
Authors Ning Ma, Chris Bartels, Jeff A. Bilmes, Phil Green