Automated analysis of human affective behavior has attracted increasing attention in recent years. With the research shift toward spontaneous behavior, many challenges have come to the surface, ranging from database collection strategies to the use of new feature sets (e.g., lexical cues in addition to prosodic features). The use of contextual information, however, is rarely addressed in the field of affect expression recognition, even though it is evident that affect recognition by humans is largely influenced by contextual information. Our contribution in this paper is threefold. First, we introduce a novel set of features based on cepstrum analysis of pitch and intensity contours. We evaluate the usefulness of these features on two different databases: the Berlin Database of Emotional Speech (EMO-DB) and a locally collected audiovisual database recorded in car settings (CVRRCar-AVDB). The overall recognition accuracy achieved for seven emotions in the EMO-DB database is over 84%, and over 87% for three emotion classes ...
Ashish Tawari, Mohan M. Trivedi
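
To make the feature idea concrete, the sketch below shows one plausible way to compute cepstral coefficients from a pitch (or intensity) contour: treat the frame-level contour as a signal, take the log magnitude spectrum, and transform back to the quefrency domain. This is an illustrative reading of "cepstrum analysis of pitch and intensity contours", not the authors' implementation; the function name, parameter choices, and toy contour are assumptions.

```python
# Illustrative sketch (assumed, not the paper's code): cepstral coefficients
# of a 1-D prosodic contour such as a frame-by-frame F0 or intensity track.
import numpy as np


def contour_cepstrum(contour, num_coeffs=13):
    """Return the first `num_coeffs` cepstral coefficients of a 1-D contour."""
    contour = np.asarray(contour, dtype=float)
    contour = contour - contour.mean()          # remove the DC offset
    spectrum = np.abs(np.fft.rfft(contour))     # magnitude spectrum of the contour
    log_spectrum = np.log(spectrum + 1e-10)     # small floor avoids log(0)
    cepstrum = np.fft.irfft(log_spectrum)       # back to the quefrency domain
    return cepstrum[:num_coeffs]


if __name__ == "__main__":
    # Toy pitch contour: a slowly rising F0 with a small sinusoidal modulation.
    t = np.linspace(0, 1, 200)
    f0 = 120 + 30 * t + 5 * np.sin(2 * np.pi * 4 * t)
    print(contour_cepstrum(f0, num_coeffs=5))
```

In such a setup, the low-order coefficients summarize the gross shape of the contour, which is one way contour-level cepstral features can feed a standard emotion classifier.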