Multi-Modal Emotion Recognition Using Canonical Correlations and Acoustic Features

Download luks.fe.uni-lj.si

Information about the psycho-physical state of a subject is becoming a valuable addition to modern audio and video recognition systems. Besides enabling a better user experience, it can also improve the recognition accuracy of the base system. In this article, we present our approach to a multi-modal (audio-video) emotion recognition system. For the audio sub-system, a feature set comprising prosodic, spectral, and cepstral features is selected, and a support vector classifier is used to produce scores for each emotional category. For the video sub-system, a novel approach is presented that does not rely on tracking specific facial landmarks and thus eliminates the problems that usually arise when the tracking algorithm fails to detect the correct area. The system is evaluated on the eNTERFACE database, and the recognition accuracy of our audio-video fusion is compared with published results in the literature.
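
The abstract describes per-category scores from each modality being combined into a single audio-video decision, but it does not state the fusion rule. As a minimal sketch only, the Python below assumes a simple weighted-sum fusion of min-max normalized scores. The function names (`normalize`, `fuse`), the `audio_weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale scores to [0, 1] so both modalities are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All categories scored equally: return a neutral value for each.
        return {label: 0.5 for label in scores}
    return {label: (s - lo) / (hi - lo) for label, s in scores.items()}


def fuse(
    audio: Dict[str, float],
    video: Dict[str, float],
    audio_weight: float = 0.5,
) -> Tuple[str, Dict[str, float]]:
    """Weighted-sum score fusion (an assumed rule, not the paper's method).

    Returns the winning emotion label and the fused score per category.
    """
    a, v = normalize(audio), normalize(video)
    fused = {
        label: audio_weight * a[label] + (1.0 - audio_weight) * v[label]
        for label in a
    }
    return max(fused, key=fused.get), fused


# Hypothetical per-category scores from each sub-system.
audio_scores = {"anger": 1.2, "happiness": 0.3, "sadness": -0.5}
video_scores = {"anger": 0.1, "happiness": 0.9, "sadness": 0.2}

label, fused_scores = fuse(audio_scores, video_scores)
print(label, fused_scores)
```

Setting `audio_weight` to 1.0 or 0.0 reduces the sketch to the audio-only or video-only decision, which is a convenient way to compare a fused result against each single modality.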

Rok Gajsek, Vitomir Struc, France Mihelic

Computer Vision | ICPR 2010 | Recognition Accuracy | Recognition System | Superior Recognition Accuracy |

Post Info

Added	23 Jun 2010
Updated	23 Jun 2010
Type	Conference
Year	2010
Where	ICPR
Authors	Rok Gajsek, Vitomir Struc, France Mihelic
