In this paper, we present a new method for mapping natural speech to lip shape animation in real time. The speech signal, represented by MFCC vectors, is classified into viseme classes using neural networks. The topology of the neural networks is configured automatically using genetic algorithms. This eliminates the need for tedious manual neural network design by trial and error and considerably improves the viseme classification results. The method is suitable for both real-time and offline applications.
Goranka Zoric, Igor S. Pandzic
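
The sketch below illustrates the general idea described in the abstract: MFCC feature vectors are classified into viseme classes by a neural network, and a genetic search chooses part of the network topology instead of manual trial and error. It is a minimal illustration, not the authors' implementation; the feature and class counts, the random stand-in data, and the restriction of the search to a single hidden-layer size are all assumptions made for demonstration.

```python
# Minimal sketch: MFCC frame -> viseme classifier with a hidden-layer size
# chosen by a simple genetic algorithm. All dimensions and the toy data are
# assumptions for illustration; this is not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
N_MFCC, N_VISEMES = 12, 6              # assumed feature and class counts
X = rng.normal(size=(500, N_MFCC))     # stand-in for real MFCC vectors
y = rng.integers(0, N_VISEMES, 500)    # stand-in viseme labels

def train_and_score(hidden, X, y, epochs=50, lr=0.05):
    """Train a one-hidden-layer softmax net; return validation accuracy."""
    split = int(0.8 * len(X))
    Xtr, ytr, Xva, yva = X[:split], y[:split], X[split:], y[split:]
    W1 = rng.normal(scale=0.1, size=(N_MFCC, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, N_VISEMES))
    T = np.eye(N_VISEMES)[ytr]                       # one-hot targets
    for _ in range(epochs):
        H = np.tanh(Xtr @ W1)                        # hidden activations
        logits = H @ W2
        P = np.exp(logits - logits.max(1, keepdims=True))
        P /= P.sum(1, keepdims=True)                 # softmax outputs
        G = (P - T) / len(Xtr)                       # output-layer gradient
        W2 -= lr * H.T @ G
        W1 -= lr * Xtr.T @ ((G @ W2.T) * (1 - H**2))
    pred = np.argmax(np.tanh(Xva @ W1) @ W2, axis=1)
    return (pred == yva).mean()

# Simple genetic search over the hidden-layer size, standing in for the
# fuller topology configuration described in the abstract.
population = list(rng.integers(4, 64, size=8))
for generation in range(5):
    scored = sorted(population, key=lambda h: -train_and_score(h, X, y))
    parents = scored[:4]                                   # selection
    children = []
    for _ in range(4):
        a, b = rng.choice(parents, size=2, replace=False)  # crossover
        child = (int(a) + int(b)) // 2 + int(rng.integers(-4, 5))  # mutation
        children.append(max(2, child))
    population = parents + children

best = max(population, key=lambda h: train_and_score(h, X, y))
print("best hidden-layer size found:", best)
```

With real MFCC frames and viseme labels in place of the random arrays, the same loop would select network topologies by validation accuracy, which is the role the genetic algorithm plays in the described method.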