Combining HMM-based melody extraction and NMF-based soft masking for separating voice and accompaniment from monaural audio

Modern monaural voice and accompaniment separation systems usually consist of two main modules: melody extraction and time-frequency masking. A main distinction between different separation systems lies in what approaches are used for the two modules. Popular techniques for melody extraction include hidden Markov models (HMMs) and non-negative matrix factorization (NMF), and masking includes hard and soft masking. This paper investigates the flaw of NMF-based melody extraction, and proposes the combination of HMM-based melody extraction (equipped with a newly-defined feature) and NMF-based soft masking. Evaluations on two publicly available databases show that the proposed system reaches state-of-the-art performance and outperforms several other combinations.
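
To illustrate the difference between hard and soft masking mentioned in the abstract, here is a minimal sketch in Python with NumPy. It is not the authors' system: the toy spectrogram, the NMF rank, and the choice of which components count as "voice" (`voice_idx`) are all assumptions made for illustration. In a real system that assignment would be guided by the extracted melody, which this sketch does not attempt.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (freq x time) as W @ H using
    multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def soft_mask(voice_est, accomp_est, eps=1e-12):
    """Ratio mask in [0, 1]: share of each time-frequency bin given to the voice."""
    return voice_est / (voice_est + accomp_est + eps)

def hard_mask(voice_est, accomp_est):
    """Binary mask: a bin goes entirely to whichever source is stronger."""
    return (voice_est > accomp_est).astype(float)

# Toy magnitude spectrogram (assumption: random data standing in for |STFT|).
rng = np.random.default_rng(1)
V = rng.random((64, 40))

W, H = nmf(V, rank=6)

# Hypothetical split: pretend the first two components model the voice.
voice_idx = [0, 1]
accomp_idx = [2, 3, 4, 5]
voice_est = W[:, voice_idx] @ H[voice_idx, :]
accomp_est = W[:, accomp_idx] @ H[accomp_idx, :]

M_soft = soft_mask(voice_est, accomp_est)
M_hard = hard_mask(voice_est, accomp_est)

# Apply the masks to the mixture spectrogram.
voice_soft = M_soft * V
accomp_soft = (1.0 - M_soft) * V
voice_hard = M_hard * V
```

With a soft mask, each bin's energy is shared between the two sources in proportion to their estimates, whereas the hard mask assigns each bin wholly to one source. In both cases the masked voice and accompaniment add back up to the mixture.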

Yun Wang, Zhijian Ou

ICASSP 2011 | Melody Extraction | Separation Systems | Signal Processing | Soft Masking |

Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Yun Wang, Zhijian Ou
