Tonal context labeling using quantized F0 symbols for improving tone correctness in average-voice-based speech synthesis

This paper proposes a technique for improving tone correctness in Thai speech synthesis based on an average voice model trained on a nonprofessional speech corpus. The proposed technique uses quantized F0 symbols as the tonal context in order to obtain an appropriate F0 model. With this technique, the prosodic context can be extracted directly from real speech, which prevents inconsistency between the speech data and F0 labels generated from transcription, an inconsistency that degrades the naturalness and tone correctness of synthetic speech. We examine two types of tonal context labeling using the quantized F0 symbols, based on phone and sub-phone boundaries. Objective and subjective test results show that the proposed technique improves both the naturalness and the tone correctness of synthetic speech when only a small amount of speech data from nonprofessional target speakers is used.
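As a rough illustration of what labeling segments with quantized F0 symbols could look like, the sketch below maps each phone (or sub-phone) segment to a discrete symbol derived from its mean log-F0. It is not the authors' implementation. The function names, the number of levels, the use of uniform bins over the utterance's range of segment means, the "x" symbol for unvoiced segments, and the even split into sub-phone parts are all assumptions made for this example; the abstract does not specify any of them.

```python
import math


def segment_mean_log_f0(f0_hz, start, end):
    """Mean log-F0 over the voiced frames (F0 > 0) in [start, end), or None."""
    voiced = [math.log(f) for f in f0_hz[start:end] if f > 0]
    return sum(voiced) / len(voiced) if voiced else None


def quantize_segments(f0_hz, segments, num_levels=3):
    """Assign one symbol per segment: '0'..str(num_levels - 1), or 'x' if unvoiced.

    Assumption: bins are uniform in log-F0 between the lowest and highest
    segment mean of this utterance. The paper may quantize differently.
    """
    means = [segment_mean_log_f0(f0_hz, s, e) for s, e in segments]
    voiced_means = [m for m in means if m is not None]
    if not voiced_means:
        return ["x"] * len(segments)
    lo, hi = min(voiced_means), max(voiced_means)
    span = hi - lo
    symbols = []
    for m in means:
        if m is None:
            symbols.append("x")          # no voiced frames in this segment
        elif span == 0:
            symbols.append("0")          # flat contour: a single level
        else:
            level = int((m - lo) / span * num_levels)
            symbols.append(str(min(level, num_levels - 1)))
    return symbols


def split_segment(start, end, parts):
    """Hypothetical sub-phone boundaries: split a segment into equal parts."""
    step = (end - start) / parts
    return [(start + round(i * step), start + round((i + 1) * step))
            for i in range(parts)]


if __name__ == "__main__":
    # Toy frame-level F0 track in Hz; 0 marks an unvoiced frame.
    f0 = [0, 0, 100, 100, 100, 150, 150, 150, 200, 200, 200, 0]
    phones = [(0, 2), (2, 5), (5, 8), (8, 12)]   # frame index ranges
    print(quantize_segments(f0, phones))
    print(split_segment(2, 8, 2))
```

Because the symbols are computed from the measured F0 track rather than from the transcription, a label produced this way always agrees with what the speaker actually produced, which is the property the abstract emphasizes. Passing sub-phone ranges from `split_segment` instead of whole phones gives a finer-grained labeling of the same contour.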

Vataya Chunwijitra, Takashi Nose, Takao Kobayashi

ICASSP 2011 | Signal Processing | Synthetic Speech | Tonal Context | Tone Correctness |

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
