Most existing techniques for analyzing face images assume that the faces are at near-frontal poses. Generalizing to non-frontal faces is often difficult, due to a dearth of groun...
—As it is true for human perception that we gather information from different sources in natural and multi-modality forms, learning from multi-modalities has become an effective ...
TalkMiner is a search engine for lecture webcasts. Lecture videos are processed to recover a set of distinct slide images and OCR is used to generate a list of indexable terms fro...
John Adcock, Matthew Cooper, Laurent Denoue, Hamed...
Modeling visual concepts using supervised or unsupervised machine learning approaches are becoming increasing important for video semantic indexing, retrieval, and filtering appli...
Easy-to-use audio/video authoring tools play a crucial role in moving multimedia software from research curiosity to mainstream applications. However, research in multimedia author...