Real time speaker localization and detection system for camera steering in multiparticipant videoconferencing environments

A real time speaker localization and detection system for videoconferencing environments is presented. In this system, a recently proposed modified Steered Response Power - Phase Transform (SRP-PHAT) algorithm has been used as the core processing scheme. The new SRP-PHAT functional has been shown to provide robust localization performance in indoor environments without the need for having a very fine spatial grid, thus reducing the computational cost required in a practical implementation. Moreover, it has been demonstrated that the statistical distribution of location estimates when a speaker is active can be successfully used to discriminate between speech and non-speech frames by using a criterion of peakedness. As a result, talking participants can be detected and located with significant accuracy following a common processing framework.
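To make the two ideas in the abstract concrete, here is a minimal sketch and not the authors' implementation. It evaluates the conventional frequency-domain SRP-PHAT functional over a grid of candidate positions; it does not reproduce the modified functional the paper relies on. It also scores the peakedness of a set of location estimates with excess kurtosis, which is an assumed stand-in for the paper's criterion. The names `srp_phat`, `simulate_frames`, and `peakedness`, the speed of sound, the room geometry, and the free-field circular-delay simulation are all hypothetical choices made for illustration.

```python
from itertools import combinations

import numpy as np

C = 343.0  # assumed speed of sound in m/s


def simulate_frames(source, mics, fs, n, rng):
    """Toy free-field simulation (assumption): white noise reaching each
    microphone with a circular fractional delay and no reverberation."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    omega = 2 * np.pi * np.fft.rfftfreq(n, d=1.0 / fs)   # rad/s per bin
    delays = np.linalg.norm(mics - source, axis=1) / C   # seconds, one per mic
    shifted = spectrum[None, :] * np.exp(-1j * np.outer(delays, omega))
    return np.fft.irfft(shifted, n=n, axis=1)            # shape (mics, n)


def srp_phat(frames, mics, grid, fs):
    """Conventional SRP-PHAT: steered power for every candidate position."""
    n = frames.shape[1]
    spectra = np.fft.rfft(frames, axis=1)
    omega = 2 * np.pi * np.fft.rfftfreq(n, d=1.0 / fs)
    # Propagation delay from every grid point to every microphone.
    delays = np.linalg.norm(grid[:, None, :] - mics[None, :, :], axis=2) / C
    power = np.zeros(len(grid))
    for i, j in combinations(range(len(mics)), 2):
        cross = spectra[i] * np.conj(spectra[j])
        cross /= np.abs(cross) + 1e-12            # PHAT: keep phase only
        tau = delays[:, i] - delays[:, j]         # candidate TDOA per grid point
        steer = np.exp(1j * np.outer(tau, omega)) # undo the candidate TDOA
        power += np.real(steer @ cross)
    return power


def peakedness(estimates):
    """Excess kurtosis of a sequence of location estimates (assumed measure).
    Tightly clustered estimates with few outliers score high; estimates
    spread evenly over the search range score below zero."""
    x = np.asarray(estimates, dtype=float)
    dev = x - x.mean()
    var = np.mean(dev ** 2)
    if var == 0.0:
        return float("inf")  # all estimates identical: maximally peaked
    return float(np.mean(dev ** 4) / var ** 2 - 3.0)


if __name__ == "__main__":
    fs, n = 16000, 1024
    mics = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 4.0], [0.0, 4.0]])
    axis = np.arange(0.5, 4.0, 0.5)               # coarse 0.5 m grid
    grid = np.array([[x, y] for x in axis for y in axis])
    frames = simulate_frames(np.array([1.5, 2.5]), mics, fs, n,
                             np.random.default_rng(0))
    print("estimated position:", grid[np.argmax(srp_phat(frames, mics, grid, fs))])
```

In this sketch the source sits exactly on a grid point, which is the favourable case for the conventional functional. With a coarse grid and an off-grid source, the narrow SRP-PHAT peak can fall between candidate positions, which is the limitation the modified functional described in the abstract is meant to address. A steering system could feed the per-frame estimates from `srp_phat` into `peakedness` over a sliding window and treat high values as evidence of an active speaker.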

Amparo Marti, Maximo Cobos, José J. López

Core Processing Scheme | ICASSP 2011 | Robust Localization Performance | Signal Processing | Time Speaker Localization

Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Amparo Marti, Maximo Cobos, José J. López
