Human motion capture has recently attracted much attention due to commercial interest. A "touch-free" computer vision solution to the problem is desirable because it avoids the intrusiveness of standard capture devices. The object to be monitored is known a priori, which suggests including a human model in the capture process. In this paper we use a model-based approach known as analysis-by-synthesis. This approach is powerful, but its search space is potentially huge. Using multiple cues, we reduce the search space by introducing constraints derived from the 3D locations of salient points and a silhouette of the subject. Both data types are relatively easy to derive and require only limited computational effort, so the approach remains suitable for real-time applications. The approach is tested on 3D movements of a human arm, and the results show that we can successfully estimate the pose of the arm using the reduced search space.
Thomas B. Moeslund, Erik Granum
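
To make the idea of a constrained analysis-by-synthesis search concrete, the following is a minimal sketch and not the method of the paper. It assumes a planar two-link arm instead of a 3D arm, replaces the silhouette cue with a single observed elbow point, and uses hypothetical names throughout (`synthesize`, `estimate_pose`, the link lengths, and the tolerance `hand_tol`). Candidate poses are synthesized from the model, a cheap test against the measured hand location discards most of them, and only the survivors are compared against the remaining cue.

```python
import math
from itertools import product

# Assumed link lengths in metres (illustrative values, not from the paper).
UPPER, LOWER = 0.30, 0.25


def synthesize(shoulder, elbow):
    """Forward kinematics of a planar two-link arm.

    Returns the synthesized (elbow_xy, hand_xy) for the given joint angles.
    """
    ex = UPPER * math.cos(shoulder)
    ey = UPPER * math.sin(shoulder)
    hx = ex + LOWER * math.cos(shoulder + elbow)
    hy = ey + LOWER * math.sin(shoulder + elbow)
    return (ex, ey), (hx, hy)


def estimate_pose(hand_obs, elbow_obs, step_deg=2, hand_tol=0.03):
    """Grid search over joint angles, pruned by the observed hand location.

    hand_obs  : measured hand position (stand-in for a salient 3D point)
    elbow_obs : measured elbow position (hypothetical stand-in for a silhouette cue)
    Returns (best_pose, evaluated_candidates, total_candidates).
    """
    angles = [math.radians(d) for d in range(-180, 180, step_deg)]
    best, best_err, evaluated = None, math.inf, 0
    for s, e in product(angles, angles):
        elbow_xy, hand_xy = synthesize(s, e)
        hand_err = math.dist(hand_xy, hand_obs)
        if hand_err > hand_tol:
            continue  # constraint violated: skip the full comparison
        evaluated += 1
        # Match error combines both cues for the surviving candidates.
        err = hand_err + math.dist(elbow_xy, elbow_obs)
        if err < best_err:
            best, best_err = (s, e), err
    return best, evaluated, len(angles) ** 2


if __name__ == "__main__":
    true_pose = (math.radians(40), math.radians(60))
    elbow_obs, hand_obs = synthesize(*true_pose)
    pose, evaluated, total = estimate_pose(hand_obs, elbow_obs)
    print([round(math.degrees(a)) for a in pose], evaluated, "of", total)
```

In this toy setting the hand constraint leaves only a small fraction of the grid for the full comparison. In a real system the comparison would be an image-based match, which is where the saving matters.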