Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding for converting pre-processed speech data into words or sub-word units. W...
Stephen J. Melnikoff, Steven F. Quigley, Martin J....
Data sparseness is an ever dominating problem in automatic emotion recognition. Using artificially generated speech for training or adapting models could potentially ease this: t...
In the `missing data' approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectraltemporal regions which are do...
We propose a model for speech recognition that consists of multiple semi-synchronized recognizers operating on a polyphase decomposition of standard speech features. Specifically...
The length of the room impulse response characterizing the acoustic path between speaker and microphone is significantly larger than the length of the analysis window used for fea...