To learn the preferential visual attention given by humans to specific image content, we present NUSEF- an eye fixation database compiled from a pool of 758 images and 75 subjects....
Video provides not only rich visual cues such as motion and appearance, but also much less explored long-range temporal interactions among objects. We aim to capture such interact...
José, Lezama, Karteek Alahari, Josef Sivic, Ivan ...
Occlusion and lack of visibility in dense crowded scenes make it very difficult to track individual people correctly and consistently. This problem is particularly hard to tackle i...
We propose a method based on sparse representation
(SR) to cluster data drawn from multiple low-dimensional
linear or affine subspaces embedded in a high-dimensional
space. Our ...
Abstract. Can we discover audio-visually consistent events from videos in a totally unsupervised manner? And, how to mine videos with different genres? In this paper we present our...