Combining HMM-based melody extraction and NMF-based soft masking for separating voice and accompaniment from monaural audio

Modern monaural voice and accompaniment separation systems usually consist of two main modules: melody extraction and time-frequency masking. A main distinction between different separation systems lies in what approaches are used for the two modules. Popular techniques for melody extraction include hidden Markov models (HMMs) and non-negative matrix factorization (NMF), and masking includes hard and soft masking. This paper investigates the flaw of NMF-based melody extraction, and proposes the combination of HMM-based melody extraction (equipped with a newly-defined feature) and NMF-based soft masking. Evaluations on two publicly available databases show that the proposed system reaches state-of-the-art performance and outperforms several other combinations.
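To make the hard versus soft masking distinction concrete, here is a minimal Python sketch. It is not the paper's implementation: it assumes some model (for example an NMF decomposition) has already produced non-negative magnitude estimates for voice and accompaniment in each time-frequency bin, and it uses a generic ratio mask as the soft mask. The function names and the toy 2x2 "spectrograms" are hypothetical.

```python
# Illustrative sketch only: generic hard and soft (ratio) masks.
# The exact mask used in the paper is not given in the abstract.

def soft_mask(voice_mag, accomp_mag, eps=1e-12):
    """Ratio mask in [0, 1]: the share of each bin attributed to the voice."""
    return [
        [v / (v + a + eps) for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(voice_mag, accomp_mag)
    ]

def hard_mask(voice_mag, accomp_mag):
    """Binary mask: a bin goes entirely to the voice only if the voice dominates."""
    return [
        [1.0 if v > a else 0.0 for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(voice_mag, accomp_mag)
    ]

def apply_mask(mask, mixture_mag):
    """Element-wise product of a mask with the mixture magnitude spectrogram."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, mixture_mag)
    ]

# Toy estimates (rows = frequency bins, columns = time frames); values are made up.
voice_est = [[3.0, 0.0], [1.0, 2.0]]
accomp_est = [[1.0, 4.0], [1.0, 2.0]]
mixture = [[4.0, 4.0], [2.0, 4.0]]

soft = soft_mask(voice_est, accomp_est)
hard = hard_mask(voice_est, accomp_est)
voice_soft = apply_mask(soft, mixture)
voice_hard = apply_mask(hard, mixture)
```

In bins where the two estimates are equal, the hard mask in this sketch assigns everything to the accompaniment, while the soft mask splits the energy evenly. That kind of difference is what the choice between hard and soft masking controls.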

Yun Wang, Zhijian Ou

ICASSP 2011 | Melody Extraction | Separation Systems | Signal Processing | Soft Masking |

Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Yun Wang, Zhijian Ou
