Fundamental frequency modeling using wavelets for emotional voice conversion

Abstract: This paper presents a representation of fundamental frequency (F0) based on the continuous wavelet transform (CWT) for prosody modeling in emotion conversion. Emotion conversion aims to convert speech from one emotional state to another. Specifically, we use the CWT to decompose F0 into a five-scale representation corresponding to five temporal scales. A neutral voice is converted to an emotional voice within an exemplar-based voice conversion framework, in which spectrum and F0 are converted simultaneously. The simulation results demonstrate that the dynamics of F0 at different temporal scales can be well captured and converted using the five-scale CWT representation. The converted speech signals are evaluated both objectively and subjectively, and the evaluations confirm the effectiveness of the proposed method.

Keywords: voice conversion; prosody; sparse representation; emotion
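To make the five-scale decomposition in the abstract more concrete, the following is a minimal sketch of a CWT applied to an F0 contour. It is not the authors' implementation: the abstract does not state the mother wavelet, the scale spacing, or the frame shift, so the Mexican hat wavelet, the dyadic (one octave apart) scales, the 5 ms frame shift, and the names `mexican_hat`, `cwt_five_scales`, and `base_scale` are all assumptions made for illustration. The sketch also assumes the contour has already been interpolated over unvoiced frames so that every value is positive.

```python
import numpy as np


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet evaluated at dimensionless time t."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def cwt_five_scales(f0, frame_shift=0.005, base_scale=0.05, n_scales=5):
    """Decompose a continuous, strictly positive F0 contour into n_scales
    components, one per temporal scale.

    Assumptions (not taken from the abstract): Mexican hat wavelet,
    scales of base_scale * 2**(i + 1) seconds, frame_shift in seconds.
    Returns an array of shape (n_scales, len(f0)).
    """
    logf0 = np.log(np.asarray(f0, dtype=float))
    std = logf0.std() or 1.0  # guard against a perfectly flat contour
    x = (logf0 - logf0.mean()) / std  # zero mean, unit variance

    n = len(x)
    components = np.empty((n_scales, n))
    for i in range(n_scales):
        scale_frames = base_scale * 2 ** (i + 1) / frame_shift
        half = int(np.ceil(5 * scale_frames))  # wavelet support of +/- 5 scales
        t = np.arange(-half, half + 1) / scale_frames
        kernel = mexican_hat(t) / np.sqrt(scale_frames)
        # Pad with edge values so the "valid" output has exactly n frames.
        padded = np.pad(x, half, mode="edge")
        components[i] = np.convolve(padded, kernel, mode="valid")
    return components


# Usage: a synthetic 1-second contour sampled every 5 ms.
time = np.arange(200) * 0.005
contour = 120.0 + 20.0 * np.sin(2 * np.pi * 2.0 * time)
scales = cwt_five_scales(contour)
print(scales.shape)  # (5, 200)
```

Under these assumptions, the first rows follow fast, short-term F0 movements and the last rows follow slow, phrase-level trends, which is the kind of multi-scale view of prosody the abstract describes. In a conversion setting, each row could then be treated as a separate feature stream.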

ACII 2015 | Applied Computing |

Post Info

Added	13 Apr 2016
Updated	13 Apr 2016
Type	Journal
Year	2015
Where	ACII
