The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations and other body motions, such as those of the head, convey additional information...
Iain Matthews, Timothy F. Cootes, J. Andrew Bangha...
Orthogonal information present in the video signal associated with the audio helps in improving the accuracy of a speech recognition system. Audio-visual speech recognition involv...
Tanveer A. Faruquie, Abhik Majumdar, Nitendra Rajp...
This paper describes an algorithm that performs a simple form of computational auditory scene analysis to separate multiple speech signals from one another on the basis of the mod...
This paper presents a method for automatic multimodal person authentication using speech, face and visual speech modalities. The proposed method uses the motion information to loc...
Extraction of bilingual audio and text data is crucial for designing speech-to-speech (S2S) systems. In this work, we propose an automatic method to segment multilingual audio str...
Andreas Tsiartas, Prasanta Kumar Ghosh, Panayiotis...