We propose a new approach for combining acoustic and visual measurements to help recognize the lip shapes of a person speaking. Our method relies on computing the maximum likeliho...
In this paper, we propose a generative model-based approach for audio-visual event classification. This approach is based on a new unsupervised learning method using an extended p...
Ming Li, Sanqing Hu, Shih-Hsi Liu, Sung Baang, Yu ...
In this paper, we propose a new manifold representation for visual speech recognition. To this end, the real-time input video data is compressed using P...
Dahai Yu, Ovidiu Ghita, Alistair Sutherland, Paul ...
Given a video and associated text, we propose an automatic annotation scheme in which we employ a latent topic model to generate topic distributions from weighted text and then mo...
Chris Engels, Koen Deschacht, Jan Hendrik Becker, ...
We propose a novel and generic video/image reranking algorithm, IB reranking, which reorders results from text-only searches by discovering the salient visual patterns of relevant...