We suggest an approach to speech recognition where multiple sides of a conversation in a dialog or meeting are processed and decoded jointly rather than independently. We moreover...
We propose a novel multi-stream framework for continuous conversational speech recognition which employs bidirectional Long Short-Term Memory (BLSTM) networks for phoneme predicti...
This paper is concerned with the automatic recognition of dialogue acts (DAs) in multiparty conversational speech. We present a joint generative model for DA recognition ...
This paper introduces a novel contextual model for the recognition of people’s visual focus of attention (VFOA) in meetings from audio-visual perceptual cues. More specificall...
Human movements are important cues for recognizing human actions, which can be captured by explicit modeling and tracking of the actor or through space-time low-level features. Howeve...