The paper describes initial results in an ongoing project aimed at providing and analyzing standardized, representative data sets for typical context recognition tasks. Such data ...
Ernst A. Heinz, Kai S. Kunze, Stefan Sulistyo, Hol...
Deep Belief Networks (DBNs) are multi-layer generative models. They can be trained to model windows of coefficients extracted from speech, and they discover multiple layers of fea...
Abdel-rahman Mohamed, Tara N. Sainath, George Dahl...
The aim of this study is to apply a state-of-the-art speech emotion recognition engine to the detection of microsleep-endangered sleepiness states. Current approaches in speech em...
This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, where cross-modal association is considered at two levels. First, the acoustic a...