Sciweavers

18 search results - page 3 / 4
» Combining Visual and Acoustic Speech Signals with a Neural N...
PAMI
2002
13 years 7 months ago
Extraction of Visual Features for Lipreading
The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations and other body motion, such as those of the head, convey additional information...
Iain Matthews, Timothy F. Cootes, J. Andrew Bangha...
ICMI
2004
Springer
14 years 1 month ago
Articulatory features for robust visual speech recognition
Visual information has been shown to improve the performance of speech recognition systems in noisy acoustic environments. However, most audio-visual speech recognizers rely on a ...
Kate Saenko, Trevor Darrell, James R. Glass
ESANN
2007
13 years 9 months ago
A hierarchical model for syllable recognition
Inspired by recent findings on the similarities between the primary auditory and visual cortex, we propose a neural network for speech recognition based on a hierarchical feedforw...
Xavier Domont, Martin Heckmann, Heiko Wersing, Fra...
TSD
2007
Springer
14 years 1 month ago
Inter-speaker Synchronization in Audiovisual Database for Lip-Readable Speech to Animation Conversion
The present study proposes an inter-speaker audiovisual synchronization method to decrease the speaker dependency of our direct speech to animation conversion system. Our aim is to...
Gergely Feldhoffer, Balázs Oroszi, Gyö...
ICASSP
2011
IEEE
12 years 11 months ago
A multi-stream ASR framework for BLSTM modeling of conversational speech
We propose a novel multi-stream framework for continuous conversational speech recognition which employs bidirectional Long Short-Term Memory (BLSTM) networks for phoneme predicti...
Martin Wöllmer, Florian Eyben, Björn Sch...