In this work we propose an approach that combines the audio and video modalities for person tracking using graphical models. We present a principled and intuitive framework for fusing these modalities to obtain robustness against occlusion and changes in appearance. We further exploit the temporal correlation of a moving object's state across adjacent frames to handle cases where having both modalities may still not suffice, e.g., when the person being tracked is occluded and not speaking. The improvement in tracking performance is demonstrated at each step and evaluated against manually annotated ground truth.
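As a concrete illustration (a standard formulation, not necessarily the exact model developed in this work), audio-visual fusion with temporal smoothing can be expressed as a recursive Bayesian filter, where the audio observation $a_t$ and video observation $v_t$ are assumed conditionally independent given the person's state $x_t$:
\[
p(x_t \mid a_{1:t}, v_{1:t}) \;\propto\; p(a_t \mid x_t)\, p(v_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid a_{1:t-1}, v_{1:t-1})\, dx_{t-1}.
\]
Here the transition density $p(x_t \mid x_{t-1})$ encodes the frame-to-frame temporal correlation, which allows the tracker to coast through intervals where the subject is both occluded and silent and neither observation likelihood is informative.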