Facial animation has been combined with text-to-speech synthesis to create innovative multimodal interfaces. In this paper, we present an architecture for such a multimodal interface. A face model is downloaded from a server to a client, and the client uses an MPEG-4 compliant speech synthesizer to animate the head. The server sends text and animation data to the client in addition to the regular content displayed in a web browser. We believe that this architecture can support electronic commerce by providing a friendlier, more helpful, and more intuitive user interface than a regular web browser. To substantiate these claims, we conducted experiments to understand user reactions to interactive services designed with synthetic characters. In one experiment, participants played the 'Social Dilemma' game with the computer as a partner. Results indicate that users cooperate more with a computer when an animated face represents the computer during the game. A s...
Jörn Ostermann, David R. Millen