Modeling visual concepts using supervised or unsupervised machine learning approaches are becoming increasing important for video semantic indexing, retrieval, and filtering appli...
For the purpose of Multimodal Meeting Manager Project (M4), an approach based on face and a hand tracking is proposed. The technique essentially includes skin color detection, seg...
Finding faces in visually challenging environments is crucial to many applications, such as audio-visual automatic speech recognition, video indexing, person recognition, and vide...
The automatic extraction of sports video highlights is a typical kind of personalized media production process. Many ways have been studied from the viewpoints of lowlevel audio/v...
We present an approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a joint particle filter f...
Kai Nickel, Tobias Gehrig, Hazim Kemal Ekenel, Joh...