In this paper we address the problem of building a good speech recognizer if there is only a small amount of training data available. The acoustic models can be improved by interpo...
Stefan Steidl, Georg Stemmer, Christian Hacker, El...
Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding for converting pre-processed speech data into words or sub-word units. W...
Stephen J. Melnikoff, Steven F. Quigley, Martin J....
The first stage in many pattern recognition tasks is to generate a good set of features from the observed data. Usually, only a single feature space is used. However, in some compl...
The new model reduces the impact of local spectral and temporal variability by estimating a finite set of spectral and temporal warping factors which are applied to speech at the f...
Antonio Miguel, Eduardo Lleida, Richard Rose, Luis...
The fusion of information from heterogenous sensors is crucial to the effectiveness of a multimodal system. Noise affect the sensors of different modalities independently. A good ...
Shankar T. Shivappa, Bhaskar D. Rao, Mohan M. Triv...