This paper presents an adaptive framework for live video analysis. The activities of surveillance subjects are described using a spatio-temporal vocabulary learned from recurrent motion patterns. The repetitive nature of object trajectories is used to build a topographical map of the scene, in which nodes are points of interest and edges correspond to activities. The graph is learned in an unsupervised manner, yet it remains flexible and can adapt to changes in the environment or other scene variations.
Brendan Morris, Mohan M. Trivedi