HNM-based MFCC+F0 extractor applied to statistical speech synthesis


Currently, the statistical framework based on Hidden Markov Models (HMMs) plays a relevant role in speech synthesis, while voice conversion systems based on Gaussian Mixture Models (GMMs) are almost standard. In both cases, statistical modeling is applied to learn distributions of acoustic vectors extracted from speech signals, each vector containing a suitable parametric representation of one speech frame. The overall performance of the systems is often limited by the accuracy of the underlying speech parameterization and reconstruction method. The method presented in this paper allows accurate MFCC extraction and high-quality reconstruction of speech signals assuming a Harmonics plus Noise Model (HNM). Its suitability for high-quality HMM-based speech synthesis is shown through subjective tests.
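For readers unfamiliar with the Harmonics plus Noise Model mentioned in the abstract, the sketch below illustrates only the general idea: a voiced frame is treated as a sum of sinusoids at integer multiples of F0 plus a noise component. It is not the authors' extractor and does not reproduce their MFCC analysis. The function name `synthesize_hnm_frame`, its parameters, and the use of unshaped white Gaussian noise are all assumptions made for illustration.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, noise_std,
                         sample_rate=16000, num_samples=400, seed=0):
    """Build one frame as harmonic part + noise part (illustrative only).

    All names and defaults are hypothetical; they are not taken from the paper.
    """
    rng = random.Random(seed)
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        # Harmonic part: sinusoids at integer multiples of the fundamental f0.
        harmonic = sum(
            a * math.cos(2.0 * math.pi * (k + 1) * f0 * t + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases))
        )
        # Noise part: plain white Gaussian noise here (an assumption);
        # a practical HNM would shape the noise spectrally.
        noise = rng.gauss(0.0, noise_std)
        frame.append(harmonic + noise)
    return frame


# Two harmonics of a 100 Hz fundamental with a small amount of noise.
example_frame = synthesize_hnm_frame(
    f0=100.0, amplitudes=[1.0, 0.5], phases=[0.0, 0.0], noise_std=0.01
)
```

Setting `noise_std` to zero leaves a purely periodic signal, which is a convenient way to check the harmonic part in isolation.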

Daniel Erro, Iñaki Sainz, Eva Navas, Inma Hernáez


Hidden Markov Models | ICASSP 2011 | Signal Processing | Speech Signals | Speech Synthesis


Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Conference
Year	2011
Where	ICASSP
Authors	Daniel Erro, Iñaki Sainz, Eva Navas, Inma Hernáez
