Sciweavers

TASLP
2016

Complex Ratio Masking for Monaural Speech Separation

Speech separation systems usually operate on the short-time Fourier transform (STFT) of noisy speech, enhancing only the magnitude spectrum and leaving the phase spectrum unchanged. This practice stems from a long-held belief that the phase spectrum is unimportant for speech enhancement. Recent studies, however, suggest that phase is important for perceptual quality, leading some researchers to consider enhancing both the magnitude and phase spectra. We present a supervised monaural speech separation approach that simultaneously enhances the magnitude and phase spectra by operating in the complex domain. Our approach uses a deep neural network to estimate the real and imaginary components of the ideal ratio mask defined in the complex domain. We report separation results for the proposed method and compare them to related systems. The proposed approach improves over other methods when evaluated with several objective metrics, including the perceptual evaluation of speech quality (PESQ),...
Donald S. Williamson, Yuxuan Wang, DeLiang Wang
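
As a rough illustration of the idea in the abstract, the sketch below computes a ratio mask in the complex domain for a few time-frequency units and contrasts it with a magnitude-only mask that keeps the noisy phase. It is a minimal sketch, not the authors' implementation: the function names, the `eps` guard, and the toy values are all assumptions made for illustration, and the oracle mask computed here merely stands in for the quantity that a deep neural network would estimate.

```python
# Illustrative sketch only: names and toy values are hypothetical.

def complex_ratio_mask(clean, noisy, eps=1e-12):
    """Return a complex mask M with M * noisy approximately equal to clean.

    This is the complex division clean / noisy written out in terms of
    real and imaginary parts, for a single time-frequency unit.
    """
    yr, yi = noisy.real, noisy.imag
    sr, si = clean.real, clean.imag
    denom = yr * yr + yi * yi + eps  # eps guards against division by zero
    m_real = (yr * sr + yi * si) / denom
    m_imag = (yr * si - yi * sr) / denom
    return complex(m_real, m_imag)


def magnitude_ratio_mask(clean, noisy, eps=1e-12):
    """Return a real-valued mask that scales magnitude and keeps noisy phase."""
    return abs(clean) / (abs(noisy) + eps)


# Toy STFT coefficients (assumed values, not taken from any dataset).
clean_units = [complex(1.0, 2.0), complex(-0.5, 0.25)]
noise_units = [complex(0.3, -0.7), complex(0.2, 0.1)]
noisy_units = [s + n for s, n in zip(clean_units, noise_units)]

# Masking in the complex domain corrects both magnitude and phase.
complex_estimates = [
    complex_ratio_mask(s, y) * y for s, y in zip(clean_units, noisy_units)
]

# Magnitude-only masking matches |clean| but leaves the noisy phase in place.
magnitude_estimates = [
    magnitude_ratio_mask(s, y) * y for s, y in zip(clean_units, noisy_units)
]

complex_errors = [abs(e - s) for e, s in zip(complex_estimates, clean_units)]
magnitude_errors = [abs(e - s) for e, s in zip(magnitude_estimates, clean_units)]
```

In this toy setting the complex-domain estimates match the clean coefficients up to the `eps` guard, while the magnitude-only estimates retain a residual error caused by the unchanged noisy phase.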
Added 10 Apr 2016
Updated 10 Apr 2016
Type Journal
Year 2016
Where TASLP