Recently, Deep Belief Networks (DBNs) have been proposed for phone recognition and were found to achieve highly competitive performance. In the original DBNs, only framelevel info...
This paper investigates the use of speech-to-text methods for assigning an emotion class to a given speech utterance. Previous work shows that an emotion extracted from text can c...
Florian Metze, Anton Batliner, Florian Eyben, Tim ...
Most existing binaural approaches to speech segregation rely on spatial filtering. In environments with minimal reverberation and when sources are well separated in space, spatial...
John Woodruff, Rohit Prabhavalkar, Eric Fosler-Lus...
In this paper, we propose a joint optimal method for automatic speech recognition (ASR) and ideal binary mask (IBM) estimation in transformed into the cepstral domain through a ne...
Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson
In this paper, we propose a novel boosted mixture learning (BML) framework for Gaussian mixture HMMs in speech recognition. BML is an incremental method to learn mixture models fo...