COLING
2010

Latent Mixture of Discriminative Experts for Multimodal Prediction Modeling

During face-to-face conversation, people naturally integrate speech, gestures, and higher-level language interpretations to predict the right time to start talking or to give backchannel feedback. In this paper, we introduce a new model called Latent Mixture of Discriminative Experts, which addresses some of the key issues in multimodal language processing: (1) temporal synchrony/asynchrony between modalities, (2) micro-dynamics, and (3) integration of different levels of interpretation. We present an empirical evaluation on the prediction of listener nonverbal feedback (e.g., head nods), based on observable behaviors of the speaker. We confirm the importance of combining four types of multimodal features: lexical, syntactic structure, eye gaze, and prosody. We show that our Latent Mixture of Discriminative Experts model outperforms previous approaches based on Conditional Random Fields (CRFs) and Latent-Dynamic CRFs.
Derya Ozkan, Kenji Sagae, Louis-Philippe Morency
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2010
Where COLING