Probabilistic Inference of Speech Signals from Phaseless Spectrograms

Many techniques for complex speech processing such as denoising and deconvolution, time/frequency warping, multiple speaker separation, and multiple microphone analysis operate on sequences of short-time power spectra (spectrograms), a representation which is often well-suited to these tasks. However, a significant problem with algorithms that manipulate spectrograms is that the output spectrogram does not include a phase component, which is needed to create a time-domain signal that has good perceptual quality. Here we describe a generative model of time-domain speech signals and their spectrograms, and show how an efficient optimizer can be used to find the maximum a posteriori speech signal, given the spectrogram. In contrast to techniques that alternate between estimating the phase and a spectrally-consistent signal, our technique directly infers the speech signal, thus jointly optimizing the phase and a spectrally-consistent signal. We compare our technique with a standard met...
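
To make the problem concrete, the sketch below illustrates the kind of alternating scheme the abstract contrasts against: keep the phase of the current estimate, impose the target magnitudes, then resynthesize a spectrally-consistent signal by least-squares overlap-add. It is not the authors' generative model or their MAP optimizer. The frame length, hop size, Hamming window, test signal, and all function names (`stft`, `istft`, `alternate`, `magnitude_error`) are assumptions chosen for illustration, and only the Python standard library is used.

```python
import cmath
import math
import random

N = 8     # frame length (assumed for illustration)
HOP = 2   # hop size (assumed for illustration)
# Hamming window: strictly positive, so the overlap-add normaliser is never zero.
WIN = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def dft(frame, sign=-1):
    """Unnormalised DFT (sign=-1) or unnormalised inverse DFT (sign=+1)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def stft(x):
    """Windowed short-time Fourier transform; one complex spectrum per frame."""
    return [dft([WIN[n] * x[s + n] for n in range(N)])
            for s in range(0, len(x) - N + 1, HOP)]


def istft(frames, length):
    """Least-squares overlap-add: the real signal whose STFT is closest to `frames`."""
    num = [0.0] * length
    den = [0.0] * length
    for i, spec in enumerate(frames):
        frame = dft(spec, sign=+1)
        for n in range(N):
            num[i * HOP + n] += WIN[n] * frame[n].real / N
            den[i * HOP + n] += WIN[n] ** 2
    return [a / b for a, b in zip(num, den)]


def magnitude_error(x, target_mag):
    """Squared distance between the STFT magnitudes of x and the target magnitudes."""
    return sum((abs(c) - m) ** 2
               for spec, mags in zip(stft(x), target_mag)
               for c, m in zip(spec, mags))


def alternate(target_mag, length, iterations=50, seed=0):
    """Alternate between phase estimation and consistent-signal estimation."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(length)]  # random starting signal
    errors = [magnitude_error(x, target_mag)]
    for _ in range(iterations):
        # Step 1: keep the current phase, impose the target magnitudes.
        frames = [[m * cmath.exp(1j * cmath.phase(c)) for c, m in zip(spec, mags)]
                  for spec, mags in zip(stft(x), target_mag)]
        # Step 2: find the time-domain signal most consistent with those frames.
        x = istft(frames, length)
        errors.append(magnitude_error(x, target_mag))
    return x, errors


# Toy example: discard the phase of a two-tone signal, then try to recover a signal.
signal = [math.sin(2 * math.pi * 3 * t / 32) + 0.5 * math.sin(2 * math.pi * 7 * t / 32)
          for t in range(32)]
target_mag = [[abs(c) for c in spec] for spec in stft(signal)]
estimate, errors = alternate(target_mag, len(signal))
print(errors[0], errors[-1])
```

Each pass cannot increase the magnitude mismatch, but the iteration may settle in a local minimum, and it treats phase and signal in separate steps. The abstract's stated motivation is to avoid that separation by inferring the time-domain signal directly.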

Kannan Achan, Sam T. Roweis, Brendan J. Frey

NIPS 2003 | NIPS 2007 | Perceptual Quality | Posteriori Speech Signal | Speech Signal |

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	NIPS
Authors	Kannan Achan, Sam T. Roweis, Brendan J. Frey