We propose to improve speech recognition performance on speaker-independent, mixed language speech by asymmetric acoustic modeling. Mixed language is either inter-sentential code ...
A layered approach to information retrieval permits the inclusion of multiple search engines as well as multiple databases, with a natural language layer to convert English querie...
This paper presents two different approaches to automatic captioning of geo-tagged images by summarizing multiple web-documents that contain information related to an image’s lo...
This paper describes the design and evaluation of prosodically-sensitive concatenative units for a Korean text-to-speech (TTS) synthesis system. The diphones used are prosodically...
—Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging be...
Thomas Kollar, Stefanie Tellex, Deb Roy, Nicholas ...