A joint particle filter for audio-visual speaker tracking

In this paper, we present a novel approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a joint particle filter framework. The filter performs sampled projections of 3D location hypotheses and scores them using features from both audio and video. On the video side, the features are based on foreground segmentation, multiview face detection and upper body detection. On the audio side, the time delays of arrival between pairs of microphones are estimated with a generalized cross correlation function. Computationally expensive features are evaluated only at the particles’ projected positions in the respective camera images, thus the complexity of the proposed algorithm is low. We evaluated the system on data that was recorded during actual lectures. The results of our experiments were 36 cm average error for video only tracking, 46 cm for audio only, and 31 cm for the combined audio-video system. C...
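
The two core pieces described above lend themselves to a compact sketch: estimating a time delay of arrival between a microphone pair with a generalized cross correlation, and one predict-weight-resample step of a joint particle filter over 3D position hypotheses. The sketch below is a minimal Python/NumPy illustration, not the paper's implementation: the PHAT weighting, the random-walk motion model, the Gaussian audio likelihood, and the `score_video` placeholder (standing in for the foreground, face, and upper-body features) are all assumptions for illustration.

```python
import numpy as np

def gcc_phat(sig_a, sig_b, fs, max_tau=None):
    """Estimate the time delay of arrival (in seconds) between two
    microphone signals via generalized cross correlation. PHAT
    whitening is one common weighting choice (an assumption here)."""
    n = len(sig_a) + len(sig_b)
    spec_a = np.fft.rfft(sig_a, n=n)
    spec_b = np.fft.rfft(sig_b, n=n)
    cross = spec_a * np.conj(spec_b)
    cross /= np.abs(cross) + 1e-12            # PHAT whitening
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs

def joint_pf_step(particles, weights, cameras, score_video,
                  mic_pairs, measured_tdoas,
                  c=343.0, motion_std=0.1, audio_std=1e-4):
    """One step of a joint audio-visual particle filter over 3D
    speaker positions. `cameras` holds 3x4 projection matrices;
    `score_video(cam_idx, u, v)` returns an image-based likelihood
    at a projected pixel (a placeholder for the actual features)."""
    n = len(particles)
    # Predict: random-walk dynamics (an assumed motion model).
    particles = particles + np.random.normal(0.0, motion_std,
                                             particles.shape)
    for i, x in enumerate(particles):
        w = 1.0
        # Video: evaluate features only at the particle's projection
        # into each camera image.
        xh = np.append(x, 1.0)
        for cam_idx, proj in enumerate(cameras):
            u, v, s = proj @ xh
            w *= score_video(cam_idx, u / s, v / s)
        # Audio: compare the TDOA the hypothesis would produce at each
        # microphone pair against the measured one (Gaussian model).
        for (m1, m2), tau_meas in zip(mic_pairs, measured_tdoas):
            tau_hyp = (np.linalg.norm(x - m1)
                       - np.linalg.norm(x - m2)) / c
            w *= np.exp(-0.5 * ((tau_hyp - tau_meas) / audio_std) ** 2)
        weights[i] = w
    weights /= weights.sum()
    # Systematic resampling.
    positions = (np.arange(n) + np.random.rand()) / n
    idx = np.minimum(np.searchsorted(np.cumsum(weights), positions),
                     n - 1)
    return particles[idx], np.full(n, 1.0 / n)
```

Note how the sketch mirrors the paper's efficiency argument: the expensive visual features are evaluated only at the particles' projected image positions, so the per-frame cost grows with the particle count rather than with an exhaustive image search.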
Type: Conference
Year: 2005
Where: ICMI
Authors: Kai Nickel, Tobias Gehrig, Rainer Stiefelhagen, John W. McDonough