A computer vision system for tracking multiple people in relatively unconstrained environments is described. Tracking is performed at three levels of abstraction: regions, people and groups. A novel, adaptive background subtraction method that combines colour and gradient information is used to cope with shadows and unreliable colour cues. People are tracked through mutual occlusions as they form groups and part from one another. Strong use is made of colour information to disambiguate occlusions and to provide qualitative estimates of depth ordering and position during occlusion. Some simple interactions with objects can also be detected. The system is tested using indoor and outdoor sequences. It is robust and should provide a useful mechanism for boot-strapping and reinitialisation of tracking using more specific but less robust human models.
Stephen J. McKenna, Sumer Jabri, Zoran Duric, Harr