ICASSP
2011
IEEE

A frame mapping based HMM approach to cross-lingual voice transformation

Cross-lingual voice transformation is challenging when the source language (L1) and the target language (L2) differ greatly in their phonetics and prosody. We propose a frame mapping based HMM approach to this problem. The source speaker’s speech data is first warped in frequency toward the target speaker by mapping the corresponding formants of selected vowels. The parameter trajectories of the warped data are then “tiled” with frames from the target speaker’s L2 data. The tiled trajectories form a simulated training set for the target speaker in L1, which is used to train an HMM-based TTS system. With a bilingual (Mandarin and English) source speaker and a monolingual (English) target speaker, the frame mapping based approach generates highly intelligible, good-quality speech in L1 (Mandarin) that sounds rather close to the target speaker. The good performance of the cross-lingual voice transformation is confirmed with speaker similarity, naturalness and intelli...
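The two steps named in the abstract, formant-based frequency warping and frame "tiling", can be illustrated with a minimal sketch. The abstract does not specify the warping function, the feature representation, or the distance measure, so everything below is an assumption made for illustration: a piecewise-linear warp anchored at formant pairs, plain Euclidean distance between feature vectors, and the hypothetical names `warp_frequency` and `tile_trajectory`.

```python
import math


def warp_frequency(freq_hz, src_formants, tgt_formants, nyquist_hz=8000.0):
    """Piecewise-linear warp (assumed form) that maps each source formant
    onto the corresponding target formant and fixes 0 Hz and Nyquist."""
    xs = [0.0] + list(src_formants) + [nyquist_hz]
    ys = [0.0] + list(tgt_formants) + [nyquist_hz]
    # Walk the consecutive anchor pairs and interpolate inside the segment
    # that contains freq_hz.
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= freq_hz <= x1:
            return y0 + (freq_hz - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("frequency outside [0, nyquist_hz]")


def tile_trajectory(warped_frames, target_frames):
    """Replace every warped source frame with the closest target frame.

    Euclidean distance is an assumption; the abstract does not name the
    measure used to select frames."""
    tiled = []
    for frame in warped_frames:
        nearest = min(target_frames, key=lambda t: math.dist(frame, t))
        tiled.append(nearest)
    return tiled


# Toy example: two formant anchors and two-dimensional feature frames.
warped_hz = warp_frequency(1000.0, [500.0, 1500.0], [600.0, 1800.0])
tiled = tile_trajectory(
    [[0.1, 0.0], [0.9, 1.1]],
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
)
```

In this sketch the tiled sequence contains only frames actually produced by the target speaker, which is the property that lets it stand in as simulated target-speaker training data in L1.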
Yao Qian, Ji Xu, Frank K. Soong
Added 20 Aug 2011
Updated 20 Aug 2011
Type Conference
Year 2011
Where ICASSP