In most microphone-array applications, it is essential to localize sources in a noisy, reverberant environment. It has been shown that computing the steered response power (SRP) is more robust than the faster, two-stage, direct time-difference-of-arrival methods. The problem with computing SRP is that the SRP space has many local maxima, so computationally intensive grid-search methods are used to find the global maximum. Grid search is too expensive for a real-time system. Several papers have addressed this issue. In this paper, we propose using stochastic region contraction (SRC) to make computing the SRP practical. We discuss one important SRP method, SRP computed using the phase transform (SRP-PHAT), review SRC, and show the computational savings. Using real data from human talkers, we show that SRC reduces computation by more than two orders of magnitude with almost no loss in accuracy.
Hoang Do, Harvey F. Silverman, Ying Yu