Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore poorly suited to transcribing spoken documents dealing w...
In this paper, we compare several approaches for extracting modulation frequency features from the speech signal using a phoneme recognition system. The general framework in th...
In this paper, we propose a new approach for extracting and representing prosodic features directly from the speech signal. We hypothesize that prosody is linked to linguistic uni...
In this paper, we propose a method that exploits 3D motion-based features between frames of 3D facial geometry sequences for dynamic facial expression recognition. An expressive...
Georgia Sandbach, Stefanos Zafeiriou, Maja Pantic,...
Many human action recognition tasks involve data that can be factorized into multiple views such as body postures and hand shapes. These views often interact with each other over ...