We propose an approach for learning visual models of object categories in an unsupervised manner in which we first build a large-scale complex network which captures the interacti...
Recent approaches to action classification in videos have used sparse spatio-temporal words encoding local appearance around interesting movements. Most of these approaches use a ...
We present a novel stochastic, adaptive strategy for tracking multiple people in a large network of video cameras. Similarities between features (appearance and biometrics) observ...
We show how to use a sampling method to find sparsely clad people in static images. People are modeled as an assembly of nine cylindrical segments. Segments are found using an EM ...
Abstract. The use of sparse invariant features to recognise classes of actions or objects has become common in the literature. However, features are often "engineered" to...