This paper introduces a method to train an error-corrective model for Automatic Speech Recognition (ASR) without using audio data. In existing techniques, it is assumed that suffi...
The use of visual information from lip movements can improve the accuracy and robustness of a speech recognition system. Accurate extraction of visual features associated with the...
Alan Wee-Chung Liew, Shu Hung Leung, Wing Hong Lau
Audio segmentation has applications in a variety of contexts, such as audio information retrieval, automatic sound analysis, and as a pre-processing step in speech recognition. Ex...
Tara N. Sainath, Dimitri Kanevsky, Giridharan Iyen...
Automatic speech recognition (ASR) results contain not only ASR errors, but also disfluencies and colloquial expressions that must be corrected to create readable transcripts. We...
Graham Neubig, Yuya Akita, Shinsuke Mori, Tatsuya ...
The recognition of the emotional states of a speaker is a multidisciplinary research area that has received great interest in recent years. One of the most important goal...
Enrique M. Albornoz, Diego H. Milone, Hugo Leonard...