Complex human activities occurring in videos can be defined in terms of temporal configurations of primitive actions. Prior work typically hand-picks the primitives, their total...
Computational models of grounded language learning have been based on the premise that words and concepts are learned simultaneously. Given the mounting cognitive evidence for conc...
Traditional aspect graphs are topology-based and are impractical for articulated objects. In this work we learn a small number of aspects, or prototypical views, from video data. ...
We consider the "group motion segmentation" problem and provide a solution for it. The group motion segmentation problem aims at analyzing motion trajectories of multiple obj...
Recently, the generative modeling approach to video segmentation has been gaining popularity in the computer vision community. For example, the flexible sprites framework has been...