Semi-blind Speech-Music Separation Using Sparsity and Continuity Priors

In this paper we propose an approach for the problem of single-channel source separation of speech and music signals. Our approach is based on representing each source’s power spectral density using dictionaries and nonlinearly projecting the mixture signal spectrum onto the combined span of the dictionary entries. We encourage sparsity and continuity of the dictionary coefficients using penalty terms (or log-priors) in an optimization framework. We propose to use a novel coordinate descent technique for optimization, which nicely handles nonnegativity constraints and nonquadratic penalty terms. We use an adaptive Wiener filter and spectral subtraction to reconstruct both of the sources from the mixture data after corresponding power spectral densities (PSDs) are estimated for each source. Using conventional metrics, we measure the performance of the system on simulated mixtures of single-person speech and piano music sources. The results indicate that the proposed method is a ...
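The pipeline described in the abstract (nonnegative dictionary coefficients, sparsity and continuity penalties, coordinate descent, then Wiener-style reconstruction) can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not give the objective, so the code below assumes a simplified stand-in, namely a least-squares fit of the mixture power spectrum with an L1 sparsity penalty and a quadratic penalty toward the previous frame's coefficients as a crude continuity term. The function names `nonneg_sparse_cd` and `wiener_masks`, the parameters `lam` and `mu`, and the dictionaries `D_speech` and `D_music` are all hypothetical.

```python
import numpy as np

def nonneg_sparse_cd(y, D, c_prev=None, lam=0.1, mu=0.0, n_iter=100):
    """Cyclic coordinate descent for the ASSUMED objective
        0.5*||y - D c||^2 + lam*sum(c) + 0.5*mu*||c - c_prev||^2,  c >= 0.
    y: mixture power spectrum of one frame, D: stacked dictionary (bins x atoms).
    """
    n_atoms = D.shape[1]
    c = np.zeros(n_atoms)
    if c_prev is None:
        # No previous frame: disable the continuity term.
        c_prev = np.zeros(n_atoms)
        mu = 0.0
    col_sq = np.sum(D * D, axis=0)       # squared norm of each atom
    denom = col_sq + mu + 1e-12          # guard against all-zero atoms
    residual = y - D @ c
    for _ in range(n_iter):
        for j in range(n_atoms):
            # Correlation of atom j with the residual that excludes atom j.
            rho = D[:, j] @ residual + col_sq[j] * c[j]
            # Closed-form 1-D minimizer, clipped at zero for nonnegativity.
            new = max(0.0, (rho - lam + mu * c_prev[j]) / denom[j])
            residual += D[:, j] * (c[j] - new)
            c[j] = new
    return c

def wiener_masks(mix_power, D_speech, D_music, **cd_kwargs):
    """Estimate per-source PSDs from the coefficients and form Wiener-style gains."""
    D = np.hstack([D_speech, D_music])
    c = nonneg_sparse_cd(mix_power, D, **cd_kwargs)
    k = D_speech.shape[1]
    psd_speech = D_speech @ c[:k]
    psd_music = D_music @ c[k:]
    total = psd_speech + psd_music + 1e-12
    return psd_speech / total, psd_music / total
```

In use, each gain vector would multiply the complex short-time spectrum of the corresponding mixture frame before inverting the transform. The adaptive Wiener filter and spectral subtraction variants mentioned in the abstract are not reproduced here.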

Hakan Erdogan, Emad M. Grais

Channel Source Separation | Computer Vision | ICPR 2010 | Penalty Terms | Power Spectral Density

Post Info

Added	07 Dec 2010
Updated	07 Dec 2010
Type	Conference
Year	2010
Where	ICPR
Authors	Hakan Erdogan, Emad M. Grais
