This paper describes a method for tracking human body motion from multiple views in real time. The method extracts a silhouette in each view using background subtraction, then intersects the viewing volumes back-projected from the silhouettes to form a visual hull, represented as a set of voxels. The voxels exert attractive forces on a kinematic model of the human body, pulling the model into alignment with the data. Because the model's segments are kinematically linked, partially occluded limbs can still be tracked. The size parameters of the kinematic model are determined automatically during an initialization phase, and the model incorporates velocity, joint-angle, and self-collision limits. With four cameras, the entire system runs in real time on a single PC at 20 frames per second. Experiments comparing the system's performance on real and synthetic imagery to ground-truth data are presented.
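The silhouette-intersection step can be pictured as space carving: a voxel survives only if it projects into the foreground in every view. Below is a minimal sketch of that test, assuming calibrated 3x4 projection matrices and boolean foreground masks; the function name and grid parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def carve_visual_hull(silhouettes, proj_mats, grid_min, grid_max, res):
    """Keep a voxel iff it projects inside the silhouette in every view.

    silhouettes: list of HxW boolean foreground masks (one per view)
    proj_mats:   list of 3x4 camera projection matrices
    grid_min, grid_max: 3-vectors bounding the working volume
    res: number of voxels along each axis
    """
    axes = [np.linspace(grid_min[i], grid_max[i], res) for i in range(3)]
    xs, ys, zs = np.meshgrid(*axes, indexing="ij")
    pts = np.stack([xs, ys, zs, np.ones_like(xs)], axis=-1).reshape(-1, 4)

    occupied = np.ones(len(pts), dtype=bool)
    for mask, P in zip(silhouettes, proj_mats):
        uvw = pts @ P.T                       # project homogeneous voxel centers
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        h, w = mask.shape
        inside = (uvw[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(pts), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        occupied &= hit                       # intersection across all views
    return pts[occupied, :3]                  # centers of surviving voxels
```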
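The model-fitting step can be illustrated with a simple assignment-and-pull rule: each occupied voxel attracts the nearest body segment. The sketch below treats body parts as 3-D line segments; the assignment and force rules are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def closest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return a + t * ab

def voxel_forces(voxels, segments):
    """Assign each voxel to its nearest body segment and accumulate an
    attractive force (sum of voxel-to-segment displacement vectors).

    voxels:   (N, 3) array of occupied voxel centers
    segments: list of (a, b) endpoint pairs, one per body part
    Returns one net 3-D force vector per segment.
    """
    forces = [np.zeros(3) for _ in segments]
    for p in voxels:
        best, best_d2, pull = None, np.inf, None
        for i, (a, b) in enumerate(segments):
            q = closest_point_on_segment(p, a, b)
            d2 = np.dot(p - q, p - q)
            if d2 < best_d2:
                best, best_d2, pull = i, d2, p - q
        forces[best] += pull
    return forces
```

In a full tracker, these per-segment forces would be propagated through the kinematic chain as joint torques, which is what would let data on a visible segment constrain an occluded neighbor.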
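The velocity and joint-angle limits amount to clamping the pose estimate at each frame. A minimal illustration follows (the function and its limit arguments are hypothetical placeholders, not values from the paper); a self-collision limit would additionally require a geometric test between segment pairs, omitted here.

```python
import numpy as np

def apply_limits(theta, theta_prev, lo, hi, max_step):
    """Clamp joint angles to [lo, hi] and bound the per-frame change."""
    step = np.clip(theta - theta_prev, -max_step, max_step)  # velocity limit
    return np.clip(theta_prev + step, lo, hi)                # joint-angle limit
```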
Jason P. Luck, Christian Debrunner, William Hoff,