When exposed to environmental noise, speakers adjust their speech production to maintain intelligible communication. This phenomenon, called the Lombard effect (LE), is known to consi...
An image-based approach provides an efficient way to perform visual speech synthesis. In an image-based visual speech synthesis system, a few lip images, namely visemes, are used for ge...
This paper looks at a parsing-based alternative to word error rate (WER) for optimizing recognition, SParseval, hypothesizing that it may be a better objective for applications su...
Dustin Hillard, Mei-Yuh Hwang, Mary P. Harper, Mar...
This demonstration involves two-way automatic speech-to-speech translation on a consumer off-the-shelf PDA. This work was done as part of the DARPA-funded Babylon project, investig...
Alex Waibel, Ahmed Badran, Alan W. Black, Robert E...
Statistical machine translation (SMT) systems for spoken languages suffer from conversational speech phenomena, in particular, the presence of speech disfluencies. We examine the i...