We consider estimation of the noise spectral variance from speech signals contaminated by highly nonstationary noise sources. In each time frame and for each frequency bin, the noise variance estimate is updated recursively with the Minimum Mean-Square Error (MMSE) estimate of the current noise power. The noise power is estimated with a spectral gain function that is obtained by an iterative data-driven training method. The proposed noise tracking method can accurately track fast changes in noise level (up to about 10 dB/s). When compared to the Minimum Statistics method for various noise sources in a speech enhancement system, it improves the segmental signal-to-noise ratio by more than 1 dB.
Jan S. Erkelens, Richard Heusdens
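The recursive update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the smoothing factor `alpha`, the function names, and the closed-form gain are assumptions made for illustration. In particular, the paper obtains its gain function by data-driven training, whereas this sketch substitutes the textbook MMSE noise-power gain under a complex-Gaussian model with a fixed, assumed a priori SNR `xi`.

```python
import numpy as np


def gaussian_noise_power_gain(post_snr, xi=0.0):
    """Gain applied to |Y|^2 to estimate the noise power.

    ASSUMPTION: stand-in for the trained gain function. Under a
    complex-Gaussian model, E[|N|^2 | Y] = |Y|^2 / (1 + xi)^2
    + xi / (1 + xi) * sigma_N^2, where xi is the a priori SNR
    (held fixed here) and post_snr = |Y|^2 / sigma_N^2.
    """
    post_snr = np.maximum(post_snr, 1e-12)  # guard against division by zero
    return 1.0 / (1.0 + xi) ** 2 + xi / ((1.0 + xi) * post_snr)


def track_noise_variance(noisy_power, init_var, alpha=0.9,
                         gain_fn=gaussian_noise_power_gain):
    """Recursively track the noise variance per frequency bin.

    noisy_power: array of shape (frames, bins) holding |Y|^2.
    init_var:    initial noise variance (scalar or one value per bin).
    alpha:       hypothetical smoothing factor in [0, 1).
    Returns an array of shape (frames, bins) with the estimates.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    n_bins = noisy_power.shape[1:]
    var = np.broadcast_to(np.asarray(init_var, dtype=float), n_bins).copy()
    estimates = np.empty_like(noisy_power)
    for t, y2 in enumerate(noisy_power):
        # A posteriori SNR relative to the previous variance estimate.
        post_snr = y2 / np.maximum(var, 1e-12)
        # Estimate of the current noise power via the gain function.
        noise_power = gain_fn(post_snr) * y2
        # Recursive (first-order) update of the noise variance.
        var = alpha * var + (1.0 - alpha) * noise_power
        estimates[t] = var
    return estimates
```

Passing a different callable as `gain_fn`, for example a lookup into a trained table, changes the noise power estimate without altering the recursion itself.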