Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they re...
Two approaches are proposed for the design of tied-mixture hidden Markov models (TMHMM). One approach improves parameter sharing via partial tying of TMHMM states. To facilitate ty...
The first stage in many pattern recognition tasks is to generate a good set of features from the observed data. Usually, only a single feature space is used. However, in some compl...
This paper presents a framework for maximum a posteriori (MAP) speaker adaptation of state duration distributions in hidden Markov models (HMM). Four key issues of MAP estimation, ...
Perceptual audio coders use an estimated masked threshold for the determination of the maximum permissible just-inaudible noise level introduced by quantization. This estimate is d...