In this paper, we propose to use Artificial Neural Networks (ANN) for voice conversion. We have exploited the mapping abilities of ANN to perform mapping of spectral features of a source speaker to that of a target speaker. A comparative study of voice conversion using ANN and the state-of-the-art Gaussian Mixture Model (GMM) is conducted. The results of voice conversion evaluated using subjective and objective measures confirm that ANNs perform better transformation than GMMs and the quality of the transformed speech is intelligible and has the characteristics of the target speaker.
Srinivas Desai, E. Veera Raghavendra, B. Yegnanara