INTERSPEECH
2010

Comparison of approaches for instrumentally predicting the quality of text-to-speech systems

In this paper, we compare and combine different approaches for instrumentally predicting the perceived quality of Text-to-Speech systems. First, a log-likelihood is determined by comparing features extracted from the synthesized speech signal with features trained on natural speech. Second, parameters are extracted which capture quality-relevant degradations of the synthesized speech signal. Both approaches are combined and evaluated on three auditory test databases. The results show that auditory quality judgments can in many cases be predicted with a sufficiently high accuracy and reliability, but that there are considerable differences, mainly between male and female speech samples.
Added 18 May 2011
Updated 18 May 2011
Type Journal
Year 2010
Where INTERSPEECH
Authors Sebastian Möller, Florian Hinterleitner, Tiago H. Falk, Tim Polzehl