CereProc® Ltd. have recently released a beta version of a commercial unit selection synthesiser featuring XML control of speech style. The system is freely available for academic use and allows fine control of the rendered speech, as well as full timings to interface with avatars and other animation. With reference to this system we will discuss current state-of-the-art commercial expressive synthesis, and argue that current underlying approaches to synthesis, together with current commercial pressures, make it difficult for many systems to create characterful synthesis. We will present how CereProc's approach differs from the industry standard and how we have attempted to maintain and increase the characterfulness of CereVoice's output. We will outline the expressive synthesis markup supported by the system and how it is expressed in the underlying digital signal processing and selection tags. Finally, we will present the concept of second pass synthesis, where cues can be manually twea...
Matthew P. Aylett, Christopher J. Pidcock