A Classification-based Cocktail-party Processor

At a cocktail party, a listener can selectively attend to a single voice and filter out other acoustic interference. How to simulate this perceptual ability remains a great challenge. This paper describes a novel supervised learning approach to speech segregation, in which a target speech signal is separated from interfering sounds using spatial location cues: interaural time differences (ITD) and interaural intensity differences (IID). Motivated by the auditory masking effect, we employ the notion of an ideal time-frequency binary mask, which selects the target if it is stronger than the interference in a local time-frequency unit. Within a narrow frequency band, modifications to the relative strength of the target source with respect to the interference trigger systematic changes in the estimated ITD and IID. For a given spatial configuration, this interaction produces characteristic clustering in the binaural feature space. Consequently, we perform pattern classification in order t...
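
The ideal time-frequency binary mask mentioned in the abstract can be illustrated with a minimal sketch. It assumes the target and interference energies are known separately for each time-frequency unit, which holds only for premixed test signals. The function names, the toy numbers, and the additive mixture energy are all illustrative assumptions, and this is not the ITD/IID classifier the abstract describes.

```python
def ideal_binary_mask(target_energy, interference_energy):
    """Return a 0/1 mask: 1 where the target is stronger than the interference.

    Both arguments are 2-D lists indexed as [frequency_channel][time_frame],
    holding the (hypothetical) energy of each source in that unit.
    """
    return [
        [1 if t > i else 0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_energy, interference_energy)
    ]


def apply_mask(mixture, mask):
    """Keep mixture units where the mask is 1 and zero out the rest."""
    return [
        [m * k for m, k in zip(m_row, k_row)]
        for m_row, k_row in zip(mixture, mask)
    ]


# Toy energies for 2 frequency channels x 3 time frames (made-up numbers).
target = [[4.0, 1.0, 0.5], [0.2, 3.0, 2.0]]
interference = [[1.0, 2.0, 0.5], [0.1, 5.0, 1.0]]

# Assumption: mixture energy is the sum of the two source energies.
mixture = [[t + i for t, i in zip(tr, ir)] for tr, ir in zip(target, interference)]

mask = ideal_binary_mask(target, interference)
segregated = apply_mask(mixture, mask)
```

A unit is retained only when the target is strictly stronger, so the unit where both sources have energy 0.5 is discarded along with the interference-dominated ones.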

Nicoleta Roman, DeLiang L. Wang, Guy J. Brown

Binary Masks | NIPS 2003 | NIPS 2007 | Target Speech Signal | Time-frequency Binary Mask |

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	NIPS
Authors	Nicoleta Roman, DeLiang L. Wang, Guy J. Brown
