In this paper we present a new method for locating multiple sound sources using only a local segment of data from a large-aperture microphone array. The result of this work may be used directly or as an open-loop input to a tracking algorithm. The proposed method combines the proven, robust steered response power using the phase transform (SRP-PHAT) as a functional, agglomerative clustering, and low-cost global optimization (stochastic region contraction). In tests on real data from five talkers in a noisy environment, we show that, for each frame, our method finds the correct locations of the active sources under high noise and reverberation conditions without a priori knowledge of the number of sources.
Hoang Do, Harvey F. Silverman
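
As an illustration of the phase transform underlying the functional named in the abstract, the following is a minimal sketch of a PHAT-weighted cross-correlation for a single microphone pair; the steered response power is commonly formed by summing such values over all pairs at the delays implied by a candidate source location. The function name `gcc_phat`, its parameters, and the use of NumPy are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def gcc_phat(x, y, max_lag):
    """PHAT-weighted cross-correlation of x and y (hypothetical helper).

    Returns lags in samples (-max_lag..max_lag) and the correlation values.
    A peak at a positive lag means x is delayed relative to y.
    Assumes max_lag >= 1.
    """
    n = len(x) + len(y)                 # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # phase transform: discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index 0 corresponds to lag -max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc

# Usage: a white-noise signal and a copy delayed by 5 samples.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(5), s[:-5]))
lags, cc = gcc_phat(delayed, s, max_lag=20)
estimated_delay = int(lags[np.argmax(cc)])
```

Discarding the spectral magnitude leaves only phase information, which is what makes the correlation peak sharp; this sketch covers one pair and one frame only and does not include the clustering or stochastic region contraction steps.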