This paper describes experiments in automatic recognition of context-independent phoneme strings from meeting data using audiovisual features. Visual features are known to improve ...
We address the problem of segmentation and recognition of sequences of multimodal human interactions in meetings. These interactions can be seen as a rough structure of a meeting, ...
Marc Al-Hames, Alfred Dielmann, Daniel Gatica-Pere...
We investigate the combination of several sources of information for the purpose of subjectivity recognition and polarity classification in meetings. We focus on features from two...
Stephan Raaijmakers, Khiet P. Truong, Theresa Wils...
In this work we present a novel multi-modal mixed-state dynamic Bayesian network (DBN) for robust meeting event classification. The model uses information from lapel microphones,...
This article introduces automatic speech recognition based on Electro-Magnetic Articulography (EMA). Movements of the tongue, lips, and jaw are tracked by an EMA device, which are...