ICASSP
2011
IEEE

Combining HMM-based melody extraction and NMF-based soft masking for separating voice and accompaniment from monaural audio

Modern monaural voice and accompaniment separation systems usually consist of two main modules: melody extraction and time-frequency masking. Systems differ mainly in the approach used for each module. Popular techniques for melody extraction include hidden Markov models (HMMs) and non-negative matrix factorization (NMF); masking may be hard or soft. This paper investigates the flaw of NMF-based melody extraction and proposes combining HMM-based melody extraction (equipped with a newly defined feature) with NMF-based soft masking. Evaluations on two publicly available databases show that the proposed system reaches state-of-the-art performance and outperforms several other combinations.
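To make the hard/soft masking distinction in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes that magnitude estimates for the voice and the accompaniment are already available per time-frequency bin (for example, from some NMF-style decomposition), and the function names `soft_mask`, `hard_mask`, and `apply_mask` as well as the toy numbers are hypothetical, chosen only for illustration. The soft mask shown is a generic ratio mask; the paper may define its mask differently.

```python
from typing import List

Matrix = List[List[float]]  # rows = frequency bins, columns = time frames


def soft_mask(voice: Matrix, accomp: Matrix, eps: float = 1e-12) -> Matrix:
    """Ratio mask: share of each bin's estimated energy attributed to the voice."""
    return [
        [v / (v + a + eps) for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(voice, accomp)
    ]


def hard_mask(voice: Matrix, accomp: Matrix) -> Matrix:
    """Binary mask: a bin goes entirely to the voice only if its estimate is larger."""
    return [
        [1.0 if v > a else 0.0 for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(voice, accomp)
    ]


def apply_mask(mask: Matrix, mixture: Matrix) -> Matrix:
    """Element-wise product of a mask with the mixture magnitude spectrogram."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, mixture)
    ]


# Toy 2x2 magnitude estimates (hypothetical values, not data from the paper).
voice_est = [[3.0, 0.0], [1.0, 2.0]]
accomp_est = [[1.0, 4.0], [1.0, 2.0]]
mixture = [[4.0, 4.0], [2.0, 4.0]]

soft = soft_mask(voice_est, accomp_est)
hard = hard_mask(voice_est, accomp_est)
voice_soft = apply_mask(soft, mixture)
voice_hard = apply_mask(hard, mixture)
```

In this sketch, bins where the two estimates tie are split evenly by the soft mask but assigned wholly to the accompaniment by the hard mask, which illustrates why soft masking tends to discard less of the target signal in ambiguous bins.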
Yun Wang, Zhijian Ou
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where ICASSP