Visual information has been shown to improve the performance of speech recognition systems in noisy acoustic environments. However, most audio-visual speech recognizers rely on a ...
In this paper, we introduce a new histogram equalization-based environmental model adaptation method for robust speech recognition in noisy environments. The proposed method adapts...
We propose a novel multi-stream framework for continuous conversational speech recognition which employs bidirectional Long Short-Term Memory (BLSTM) networks for phoneme predicti...
We define the task of incremental or 0-lag utterance segmentation, that is, the task of segmenting an ongoing speech recognition stream into utterance units, and present first resu...
This paper presents a model for machine-aided human translation (MAHT) that integrates source language text and target language acoustic information to produce the text t...