For TV and radio shows containing narrowband speech, Speech-to-text (STT) accuracy on the narrowband audio can be improved by using an acoustic model trained on acoustically match...
This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
In the framework of a face verification system using local features and a Gaussian Mixture Model based classifier, we address the problem of non-frontal face verification (when on...
Abstract. The recognition of the emotional states of speaker is a multidisciplinary research area that has received great interest in the last years. One of the most important goal...
Enrique M. Albornoz, Diego H. Milone, Hugo Leonard...