In this paper, we describe a method to enhance the readability of out-of-vocabulary items (OOVs) in the textual output of a large-vocabulary continuous speech recognition system. ...
Bart Decadt, Jacques Duchateau, Walter Daelemans, ...
We introduce a direct model for speech recognition that assumes an unstructured, i.e., flat text output. The flat model allows us to model arbitrary attributes and dependences o...
Georg Heigold, Geoffrey Zweig, Xiao Li, Patrick Ng...
Untethered multimodal interfaces are more attractive than tethered ones because they are more natural and expressive for interaction. Such interfaces usually require robust vision...
The interaction between human beings and computers will be more natural if computers are able to perceive and respond to human non-verbal communication such as emotions. Although ...
Carlos Busso, Zhigang Deng, Serdar Yildirim, Murta...
Local business voice search is a popular application for mobile phones, where hands-free interaction and speed are critical to users. However, speech recognition accuracy is still...
Giuseppe Di Fabbrizio, Diamantino Caseiro, Amanda ...