This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
In this paper, we propose a computational model for social interaction between three people in a conversation, and demonstrate results using human video motion synthesis. We utilis...
Dumebi Okwechime, Eng-Jon Ong, Andrew Gilbert, Ric...
The research area of Multimedia Content Analysis (MMCA) considers all aspects of the automated extraction of new knowledge from large multimedia data streams and archives. In re...
R. Yang, Robert D. van der Mei, D. Roubos, Frank J...
A novel framework for spatially estimating unknown image data is presented. Common applications include inpainting, concealment of transmission errors, prediction in video coding,...
Haricharan Lakshman, Patrick Ndjiki-Nya, Martin K&...
We are developing a testbed for learning by demonstration combining spoken language and sensor data in a natural real-world environment. Microsoft Kinect RGB-Depth cameras allow us...