Sciweavers

KDD
2000
ACM

Learning Prosodic Patterns for Mandarin Speech Synthesis

Widespread use of text-to-speech (TTS) technology requires higher-quality synthesized speech. The prosodic pattern, which mainly describes the variation of pitch, is the key feature determining whether synthetic speech sounds unnatural and monotonous. The rules now used in most Chinese TTS systems are constructed by experts; they are qualitative and imprecise. In this paper, we propose a combination of clustering and machine learning techniques to extract prosodic patterns from a large database of actual Mandarin speech, in order to improve the naturalness and intelligibility of synthesized speech. Typical prosody models are found by cluster analysis, and several machine learning techniques, including Rough Set, ANN, and decision tree, are trained separately for the fundamental frequency and energy contours; the resulting models can be used directly in a pitch-synchronous-overlap-add-based (PSOLA-based) TTS system. The experimental results showed that synthesized prosodic features quite resembled their origina...
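As a rough illustration of the clustering step mentioned in the abstract, the sketch below groups fixed-length pitch contours into a few "typical" patterns. The abstract does not name the clustering algorithm, so plain k-means is used here purely as a stand-in; the function names, the fixed-length contour representation, and the deterministic initialization are all assumptions made for this example, not the authors' method.

```python
from typing import List, Sequence, Tuple

Contour = Sequence[float]  # assumed: F0 values sampled at fixed points of a syllable


def sq_dist(a: Contour, b: Contour) -> float:
    """Squared Euclidean distance between two equal-length contours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_contour(contours: Sequence[Contour]) -> List[float]:
    """Point-wise average of a non-empty group of contours."""
    n = len(contours)
    return [sum(col) / n for col in zip(*contours)]


def kmeans_contours(
    contours: Sequence[Contour], k: int, iters: int = 20
) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means (a stand-in, not the paper's algorithm).

    Returns (centers, labels): each center is a candidate "typical"
    contour, and labels[i] is the index of the center nearest contours[i].
    """
    # Deterministic initialization: pick k contours spread evenly through the data.
    step = len(contours) / k
    centers = [list(contours[int(i * step)]) for i in range(k)]
    labels = [0] * len(contours)
    for _ in range(iters):
        # Assignment step: attach each contour to its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(c, centers[j])) for c in contours
        ]
        # Update step: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [c for c, lab in zip(contours, labels) if lab == j]
            new_centers.append(mean_contour(members) if members else centers[j])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels
```

In a pipeline of the kind the abstract describes, the resulting cluster labels could then serve as training targets for a classifier such as a decision tree; that second stage is not shown here.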
Yiqiang Chen, Wen Gao, Tingshao Zhu
Added 25 Aug 2010
Updated 25 Aug 2010
Type Conference
Year 2000
Where KDD
Authors Yiqiang Chen, Wen Gao, Tingshao Zhu