In this paper we describe our TRECVID 2007 experiments. The MediaMill team participated in two tasks: concept detection and search. For concept detection we extract region-based im...
Cees G. M. Snoek, I. Everts, Jan van Gemert, Jan-M...
In this paper, we present a novel approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a jo...
Kai Nickel, Tobias Gehrig, Rainer Stiefelhagen, Jo...
This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
This paper presents a novel approach for generating and analyzing epipolar plane images (EPIs) from video sequences taken from a moving platform subject to vibration so that the 3...
Computational models of grounded language learning have been based on the premise that words and concepts are learned simultaneously. Given the mounting cognitive evidence for conc...