
ICASSP
2009
IEEE

Incorporating spectral subtraction and noise type for unvoiced speech segregation

Unvoiced speech poses a major challenge to current monaural speech segregation systems. It lacks harmonic structure and is highly susceptible to interference because of its relatively weak energy. This paper describes a new approach to segregating unvoiced speech from nonspeech interference. The system first estimates a voiced binary mask, and then performs unvoiced speech segregation in two stages: segmentation and grouping. In segmentation, time-frequency units labeled as 0 in the voiced binary mask are first used to estimate the noise energy, and spectral subtraction is then performed to generate time-frequency segments in unvoiced intervals. Depending on the type of noise, unvoiced segments are grouped either by selecting segments consistent with those generated by onset/offset analysis or by Bayesian classification of acoustic-phonetic features. Systematic evaluation and comparison show that the proposed approach improves the performance of unvoiced speech segregation considerably.
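The segmentation stage described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `energy[channel][frame]` layout, the `voiced_frames` flags, the choice to average only mask-0 units inside voiced frames, and the `snr_threshold` parameter are all assumptions made for illustration. The sketch estimates a per-channel noise energy from units labeled 0 in the voiced binary mask, subtracts it from the mixture energy in unvoiced frames, and keeps units whose residual exceeds the noise estimate.

```python
def estimate_noise_energy(energy, voiced_mask, voiced_frames):
    """Per-channel noise estimate (hypothetical helper).

    energy[c][t]      : mixture energy of the unit in channel c, frame t
    voiced_mask[c][t] : 1 if the unit is assigned to voiced speech, else 0
    voiced_frames[t]  : True if frame t lies in a voiced interval
    Assumption: noise is the mean energy of mask-0 units in voiced frames.
    """
    noise = []
    for chan_energy, chan_mask in zip(energy, voiced_mask):
        samples = [e for e, m, v in zip(chan_energy, chan_mask, voiced_frames)
                   if v and m == 0]
        noise.append(sum(samples) / len(samples) if samples else 0.0)
    return noise


def spectral_subtraction_mask(energy, noise, voiced_frames, snr_threshold=1.0):
    """Candidate unvoiced units after spectral subtraction (hypothetical helper).

    In each unvoiced frame, subtract the channel's noise estimate from the
    mixture energy, floor the result at zero, and mark the unit with 1 when
    the residual exceeds snr_threshold times the noise estimate.
    """
    mask = []
    for chan_energy, n in zip(energy, noise):
        row = []
        for e, v in zip(chan_energy, voiced_frames):
            if v:
                row.append(0)  # voiced frames are handled by the voiced mask
                continue
            residual = max(e - n, 0.0)  # spectral subtraction with flooring
            row.append(1 if residual > snr_threshold * n else 0)
        mask.append(row)
    return mask


# Toy input: 2 channels, 4 frames; the first two frames are voiced.
energy = [[4.0, 1.0, 1.0, 5.0],
          [2.0, 2.0, 2.0, 2.0]]
voiced_mask = [[1, 0, 0, 0],
               [0, 0, 0, 0]]
voiced_frames = [True, True, False, False]

noise = estimate_noise_energy(energy, voiced_mask, voiced_frames)
candidates = spectral_subtraction_mask(energy, noise, voiced_frames)
```

In a full system, contiguous units marked 1 would be merged into time-frequency segments and passed to the grouping stage; the sketch stops at the unit-level decision.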
Ke Hu, DeLiang Wang
Added 21 May 2010
Updated 21 May 2010
Type Conference
Year 2009
Where ICASSP
Authors Ke Hu, DeLiang Wang