Multimodal multispeaker probabilistic tracking in meetings


Tracking speakers in multiparty conversations constitutes a fundamental task for automatic meeting analysis. In this paper, we present a probabilistic approach to jointly track the location and speaking activity of multiple speakers in a multisensor meeting room, equipped with a small microphone array and multiple uncalibrated cameras. Our framework is based on a mixed-state dynamic graphical model defined on a multiperson state-space, which includes the explicit definition of a proximity-based interaction model. The model integrates audio-visual (AV) data through a novel observation model. Audio observations are derived from a source localization algorithm. Visual observations are based on models of the shape and spatial structure of human heads. Approximate inference in our model, needed given its complexity, is performed with a Markov Chain Monte Carlo particle filter (MCMC-PF), which results in high sampling efficiency. We present results, based on an objective evaluation proce...
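To make the inference step more concrete, here is a minimal sketch of one MCMC particle filter update. It is a toy under stated assumptions, not the paper's model: people live on a 1D line, the dynamics are a Gaussian random walk, each person yields one noisy position observation, and the proximity-based interaction is a simple pairwise penalty. The speaking-activity variable and the audio-visual observation model are left out entirely. All function names and parameter values (`log_interaction`, `log_target`, `mcmc_pf_step`, `dyn_std`, `obs_std`, `min_dist`, `strength`) are hypothetical.

```python
import math
import random


def gauss_logpdf(x, mean, std):
    """Log density of a 1D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def log_interaction(state, min_dist=0.5, strength=5.0):
    """Assumed proximity prior: penalize pairs closer than min_dist."""
    total = 0.0
    for i in range(len(state)):
        for j in range(i + 1, len(state)):
            total -= strength * max(0.0, min_dist - abs(state[i] - state[j]))
    return total


def log_target(state, obs, prev_particles, dyn_std=0.2, obs_std=0.3):
    """Unnormalized log filtering density for a joint multi-person state.

    likelihood * interaction * (mixture of dynamics over the previous,
    unweighted particle set).
    """
    log_lik = sum(gauss_logpdf(z, x, obs_std) for x, z in zip(state, obs))
    log_terms = [
        sum(gauss_logpdf(x, xp, dyn_std) for x, xp in zip(state, prev))
        for prev in prev_particles
    ]
    # log-sum-exp for a numerically stable mixture average
    m = max(log_terms)
    log_prior = m + math.log(sum(math.exp(t - m) for t in log_terms))
    log_prior -= math.log(len(log_terms))
    return log_lik + log_interaction(state) + log_prior


def mcmc_pf_step(prev_particles, obs, n_samples, burn_in=200, step_std=0.15, rng=random):
    """One time step: a Metropolis-Hastings chain replaces importance sampling."""
    state = list(rng.choice(prev_particles))
    current = log_target(state, obs, prev_particles)
    samples = []
    for it in range(burn_in + n_samples):
        # Perturb one person at a time with a symmetric random-walk proposal.
        i = rng.randrange(len(state))
        proposal = list(state)
        proposal[i] += rng.gauss(0.0, step_std)
        candidate = log_target(proposal, obs, prev_particles)
        # Symmetric proposal, so the acceptance ratio is the target ratio.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            state, current = proposal, candidate
        if it >= burn_in:
            samples.append(list(state))
    return samples  # unweighted particles for the next time step
```

Because the chain moves one person at a time, each proposal changes only a low-dimensional part of the joint state, which is the usual argument for using MCMC sampling instead of plain importance sampling when the multi-person state-space grows. The returned samples are unweighted and can be passed back in as `prev_particles` at the next time step.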

Daniel Gatica-Perez, Guillaume Lathoud, Jean-Marc Odobez, Iain McCowan


ICMI 2005 | Markov Chain Monte Carlo | Multiple Uncalibrated Cameras | Speaking Activity


Added	27 Jun 2010
Updated	27 Jun 2010
Type	Conference
Year	2005
Where	ICMI
Authors	Daniel Gatica-Perez, Guillaume Lathoud, Jean-Marc Odobez, Iain McCowan
