

Large Vocabulary Audio-Visual Speech Recognition Using Active Shape Models

The video signal that accompanies speech carries information orthogonal to the audio, and this information helps improve the accuracy of a speech recognition system. Audio-visual speech recognition extracts both audio and visual features from the input signal. The visual parameters are obtained by recognizing speech-dependent features in the video sequence. This paper uses geometric features to describe lip shapes, and curve-based Active Shape Models are used to extract that geometry. The geometric visual parameters are then used together with audio cepstral features to perform audio-visual classification. The bimodal system presented here is shown to improve classification results over classification using audio features alone.
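As a rough illustration of using geometric lip parameters alongside audio cepstral features, the sketch below builds one bimodal feature vector per frame. The abstract does not state which geometric parameters are used or how the two streams are combined. Lip width and height, the simple concatenation, and the names `lip_geometry` and `fuse_features` are therefore assumptions made for this example, not the authors' method.

```python
import numpy as np

def lip_geometry(contour):
    """Return [width, height] of a lip contour given as an (N, 2) array of (x, y) points.

    Width and height are assumed stand-ins for the paper's geometric features.
    """
    contour = np.asarray(contour, dtype=float)
    width = contour[:, 0].max() - contour[:, 0].min()
    height = contour[:, 1].max() - contour[:, 1].min()
    return np.array([width, height])

def fuse_features(audio_cepstra, lip_contour):
    """Concatenate one frame's audio cepstral vector with its lip-geometry vector.

    Concatenation (early fusion) is an assumption; the abstract only says the
    two feature sets are used together.
    """
    audio = np.asarray(audio_cepstra, dtype=float)
    return np.concatenate([audio, lip_geometry(lip_contour)])

# Hypothetical frame: three cepstral coefficients and four lip contour points.
frame_audio = [1.0, 2.0, 3.0]
frame_contour = [[0, 0], [4, 1], [2, 3], [1, -1]]
bimodal = fuse_features(frame_audio, frame_contour)
```

A classifier trained on such vectors sees both modalities at once. An audio-only baseline would use `frame_audio` alone.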
Added 09 Nov 2009
Updated 09 Nov 2009
Type Conference
Year 2000
Where ICPR
Authors Tanveer A. Faruquie, Abhik Majumdar, Nitendra Rajput, L. Venkata Subramaniam