Sciweavers

ICCV
2005
IEEE

Visual Speech Recognition with Loosely Synchronized Feature Streams

We present an approach to detecting and recognizing spoken isolated phrases based solely on visual input. We adopt an architecture that first employs discriminative detection of visual speech and articulatory features, and then performs recognition using a model that accounts for the loose synchronization of the feature streams. Discriminative classifiers detect the subclass of lip appearance corresponding to the presence of speech, and further decompose it into features corresponding to the physical components of articulatory production. These components often evolve in a semi-independent fashion, and conventional viseme-based approaches to recognition fail to capture the resulting co-articulation effects. We present a novel dynamic Bayesian network with a multi-stream structure and observations consisting of articulatory feature classifier scores, which can model varying degrees of co-articulation in a principled way. We evaluate our visual-only recognition system on a command utt...
Added: 24 Jun 2010
Updated: 24 Jun 2010
Type: Conference
Year: 2005
Where: ICCV
Authors: Kate Saenko, Karen Livescu, Michael Siracusa, Kevin Wilson, James R. Glass, Trevor Darrell