This paper describes a new approach to modeling duration for LVCSR using SCARF, a toolkit for speech recognition with segmental conditional random fields. We utilize SCARF’s abi...
Multimodal grammars provide an expressive formalism for multimodal integration and understanding. However, handcrafted multimodal grammars can be brittle with respect to unexpecte...
To improve the flexibility and precision of an automatic phone segmentation system for one type of expressive speech, the dubbing of fiction movies into French, we deve...
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language-modeling-based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
This paper presents a method for automatic multimodal person authentication using speech, face and visual speech modalities. The proposed method uses the motion information to loc...