We propose a technique for separating audio sources from their anechoic mixtures with long delays in an underdetermined setting (i.e., the number of audio sensors is smaller than the number of sources). It consists of two stages: 1) estimating the anechoic mixing parameters (attenuation and arrival delay) and 2) recovering the original audio sources from the estimated mixing parameters. Previous algorithms perform poorly when the delay is longer than one sample. To address this shortcoming, we estimate the maximum delay and use it to find a frequency range that produces no phase ambiguity. We then determine the mixing parameters using only the time-frequency points in this range. Finally, we apply mathematical tools to solve the underdetermined linear system and recover the original audio sources. Experimental results demonstrate the effectiveness of the proposed technique across various mixing scenarios, including noisy observations of the mixtures and different types of sounds.
Namgook Cho, C.-C. Jay Kuo
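
The role of the maximum delay in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-sensor setup, a delay measured in samples, and the standard observation that a delay d shifts the phase at normalized angular frequency w (rad/sample) by w*d, which can be read back uniquely only while |w*d| < pi. The function names `unambiguous_bins` and `estimate_delay` are hypothetical.

```python
import numpy as np

def unambiguous_bins(n_fft, max_delay):
    """Indices of positive-frequency FFT bins with no phase ambiguity.

    Assumption (not from the source): a bin is kept when w * max_delay < pi,
    where w is the bin's angular frequency in rad/sample. Bin 0 is skipped
    because a zero-frequency phase carries no delay information.
    """
    if max_delay <= 0:
        raise ValueError("max_delay must be positive")
    w = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft
    return np.flatnonzero((w > 0) & (w * max_delay < np.pi))

def estimate_delay(x1, x2, w):
    """Delay (in samples) of sensor 2 relative to sensor 1 at one
    time-frequency point, assuming a single dominant source there."""
    return -np.angle(x2 / x1) / w

# Hypothetical example: 1024-point FFT, estimated maximum delay of 4 samples.
n_fft, max_delay = 1024, 4
bins = unambiguous_bins(n_fft, max_delay)

# A source attenuated by 0.5 and delayed by 3 samples, observed at one bin.
true_delay = 3.0
w_low = 2 * np.pi * bins[10] / n_fft          # inside the safe range
est_low = estimate_delay(1.0 + 0j, 0.5 * np.exp(-1j * w_low * true_delay), w_low)

w_high = 2 * np.pi * 400 / n_fft              # outside the safe range
est_high = estimate_delay(1.0 + 0j, 0.5 * np.exp(-1j * w_high * true_delay), w_high)
```

Under these assumptions, `est_low` recovers the 3-sample delay, while `est_high` does not because the phase has wrapped. This is why the mixing parameters are estimated only from time-frequency points in the restricted range.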