The visual surveillance task is to monitor the activity of objects in a scene. In far-field settings (i.e., wide outdoor areas), most visible activity consists of objects moving from one location to another. Monitoring this activity requires low-level detection, tracking, and classification of moving objects. Both high-level activity analysis and low-level vision can be improved with knowledge of scene structure (e.g., roads, paths, and entry and exit points). Scene knowledge supports activity descriptions with spatial context, such as "car moving off road" and "person waiting at bus stop." Scene information can also improve low-level tracking and classification. For example, if an object disappears somewhere other than an exit point, the disappearance is more likely a tracking failure than a true exit. In classification, we can leverage the fact that vehicles are much more likely than pedestrians to move on the road.
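The two uses of scene knowledge described above can be illustrated with a minimal sketch. It is not the method of this work. It assumes exit points are stored as axis-aligned rectangles in image coordinates, and that an appearance-based classifier already supplies a vehicle probability. The names `Zone`, `explain_track_end`, and `vehicle_posterior`, and the likelihood values 0.9 and 0.2, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle in image coordinates (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def explain_track_end(last_xy, exit_zones):
    """Label a track that ended at last_xy using known exit zones."""
    x, y = last_xy
    if any(zone.contains(x, y) for zone in exit_zones):
        return "exit"
    # The object vanished away from every known exit point.
    return "likely_tracking_failure"


def vehicle_posterior(p_vehicle_appearance, on_road,
                      p_road_given_vehicle=0.9,      # assumed value
                      p_road_given_pedestrian=0.2):  # assumed value
    """Combine an appearance-based vehicle probability with a road-location cue (Bayes' rule)."""
    like_vehicle = p_road_given_vehicle if on_road else 1.0 - p_road_given_vehicle
    like_pedestrian = p_road_given_pedestrian if on_road else 1.0 - p_road_given_pedestrian
    numerator = p_vehicle_appearance * like_vehicle
    denominator = numerator + (1.0 - p_vehicle_appearance) * like_pedestrian
    return numerator / denominator
```

With these assumed likelihoods, an object whose appearance is ambiguous (probability 0.5) becomes more likely a vehicle when observed on the road and less likely when observed off it. In practice the zones and likelihoods would come from the scene model rather than being fixed by hand.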
Hanqing Lu, Stan Z. Li, Tianzhu Zhang