Generation of Speaker Mixture Voice using Spectrum Morphing

14 years 13 days ago


We propose a method for synthesizing a “speaker mixture voice” that carries the individualities of two speakers at once. We define a speaker mixture voice as one for which, in an ABX listening test that asks subjects to identify the speaker, 50 percent of the listeners choose speaker A and the remaining 50 percent choose speaker B. To synthesize the speaker mixture voice, we parameterize the spectrum envelope in terms of its peaks and valleys, find the correspondence between the spectrum envelopes of two independent speakers using DP matching, morph one envelope toward the other, and generate waveforms using TD-PSOLA[1]. In listening experiments, 60 percent of the 56 synthesized voices were recognized as speaker mixture voices. The primary application of the proposed method would be the individualization of characters in online games, where multiple users play a single character as separate individuals at the same time.
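The envelope-morphing steps named above (peak and valley parameterization, DP matching, morphing) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `extrema`, `dp_match`, and `morph_envelope`, the Euclidean distance between feature points, the DTW-style step pattern, and the piecewise-linear reconstruction are all assumptions made for illustration. The sketch also assumes both envelopes are sampled on the same frequency grid, and it does not cover waveform generation with TD-PSOLA.

```python
import numpy as np

def extrema(env):
    """Indices of local peaks and valleys of an envelope, plus both endpoints."""
    d = np.diff(env)
    # A slope sign change marks a peak (+ to -) or a valley (- to +).
    idx = np.where(np.sign(d[:-1]) * np.sign(d[1:]) < 0)[0] + 1
    return np.concatenate(([0], idx, [len(env) - 1]))

def dp_match(a, b):
    """Align two feature-point lists (shape (n, 2) and (m, 2)) by dynamic programming.

    Assumption: Euclidean point distance and the usual three DTW steps.
    Returns a monotonic list of index pairs (i, j).
    """
    n, m = len(a), len(b)
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    cost = np.full((n, m), np.inf)
    cost[0, 0] = dist[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(
                cost[i - 1, j] if i > 0 else np.inf,
                cost[i, j - 1] if j > 0 else np.inf,
                cost[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            cost[i, j] = dist[i, j] + prev
    # Backtrack from the last pair to the first along the cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((cost[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    return path[::-1]

def morph_envelope(env_a, env_b, alpha=0.5):
    """Morph env_a toward env_b; alpha=0 follows speaker A, alpha=1 speaker B."""
    env_a = np.asarray(env_a, dtype=float)
    env_b = np.asarray(env_b, dtype=float)
    ia, ib = extrema(env_a), extrema(env_b)
    pa = np.column_stack((ia, env_a[ia]))  # (frequency bin, amplitude)
    pb = np.column_stack((ib, env_b[ib]))
    pairs = dp_match(pa, pb)
    # Interpolate both position and height of each matched feature point.
    pos = np.array([(1 - alpha) * pa[i, 0] + alpha * pb[j, 0] for i, j in pairs])
    amp = np.array([(1 - alpha) * pa[i, 1] + alpha * pb[j, 1] for i, j in pairs])
    pos, keep = np.unique(pos, return_index=True)  # drop repeated positions
    # Piecewise-linear reconstruction on the original frequency grid.
    return np.interp(np.arange(len(env_a)), pos, amp[keep])
```

With `alpha=0.5`, a peak of speaker A and its matched peak of speaker B end up halfway between the two in both frequency and height, which is the intuition behind an equal mixture of the two speakers. A real system would likely work on log-amplitude envelopes per analysis frame and use a smoother reconstruction than straight line segments.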

Kohei Furuya, Tsuyoshi Moriyama, Shinji Ozawa
