This paper tackles the problem of surveillance video content modelling. Given a set of surveillance videos, the aims of our work are twofold: firstly a continuous video is segment...
Anecdotal evidence suggests that story-level information is important for the speech component of video retrieval. In this paper we perform a systematic examination of the combina...
We propose a multi-modal object tracking algorithm that combines appearance, motion and audio information in a particle filter. The proposed tracker fuses at the likelihood level ...
In this paper, we present a novel approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a jo...
Kai Nickel, Tobias Gehrig, Rainer Stiefelhagen, Jo...
Most video retrieval systems are multimodal, commonly relying on textual information, low- and high-level semantic features extracted from query visual examples. In this work, we ...