We propose an unsupervised image segmentation method based on texton similarity and mode seeking. The input image is first convolved with a filter bank, followed by soft cluster...
Automated facial expression recognition has received increased attention over the past two decades. Existing works in the field usually do not encode either the temporal evolutio...
Activity recognition in video is dominated by low- and mid-level features and, while demonstrably capable, these features by nature carry little semantic meaning. Inspired by the...
In this paper, we raise important issues on scalability and the required degree of supervision of existing Mahalanobis metric learning methods. Often rather tedious optimization p...
This paper addresses the discovery of activities and learns the underlying processes that govern their occurrences over time in complex surveillance scenes. To this end, we propos...
Fine-grained categorization refers to the task of classifying objects that belong to the same basic-level class (e.g. different bird species) and share similar shape or visual app...
We present a generic framework for object segmentation using depth maps based on Random Forest and Graph-cuts theory, and apply it to the segmentation of human limbs in depth maps...
This paper presents a novel method for estimating the geospatial trajectory of a moving camera with unknown intrinsic parameters, in a city-scale urban environment. The proposed m...
Gonzalo Vaca-Castano, Amir Roshan Zamir, Mubarak S...
Trajectory basis Non-Rigid Structure From Motion (NRSFM) currently faces two problems: the limit of reconstructability and the need to tune the basis size for different sequences....
A novel method is proposed for matching articulated objects in cluttered videos. The method needs only a single exemplar image of the target object. Instead of using a small set o...