We describe a scheme to combine the results of audio and face identification for multimedia indexing and retrieval. Audio analysis consists of speech and speaker recognition deri...
Mahesh Viswanathan, Homayoon S. M. Beigi, Alain Tr...
Emotion expression is an essential part of human interaction. Rich emotional information is conveyed through the human face. In this study, we analyze detailed motion-captured fac...
Angeliki Metallinou, Carlos Busso, Sungbok Lee, Sh...
—Techniques for automatic annotation of spoken content making use of speech recognition technology have long been characterized as holding unrealized promise to provide access to...
Roeland Ordelman, Franciska de Jong, Martha Larson
An automated system is presented for reducing a multi-view lecture recording into a single view video containing a best view summary of active speakers. The system uses skin color...
In conventional speaker recognition methods based on MFCC, the phase information has been ignored. Recently, we proposed a method that integrated MFCC with the phase information o...