We propose a novel hierarchical model of human dynamics for view-independent tracking of the human body in monocular video sequences. The model is trained on real data collected from a group of people. Kinematics are encoded using Hierarchical Principal Component Analysis, and dynamics are encoded using Hidden Markov Models. The top level of the hierarchy contains information about the whole body, while the lower levels contain more detailed information about the possible poses of individual subparts of the body. We show that the lower levels of the hierarchy improve tracking accuracy. In this article we describe our model and present experiments showing that we can recover 3D skeletons from 2D images in a view-independent manner, and that we can also track people the system was not trained on.
I. A. Karaulova, Peter M. Hall, A. David Marshall
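
As a rough illustration of the two ingredients named in the abstract, the sketch below fits one PCA subspace to whole-body pose vectors and a second to a subpart, then decodes a state sequence with the Viterbi algorithm for a Hidden Markov Model. It is not the authors' implementation: the function names, the array layout (one row per frame, three coordinates per joint), the choice of which columns form an "arm", and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_pca(X, k):
    """Return (mean, basis); basis has k orthonormal columns."""
    mean = X.mean(axis=0)
    # SVD of the centred data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def project(x, mean, basis):
    """Coordinates of pose x in the PCA subspace."""
    return (x - mean) @ basis

def reconstruct(z, mean, basis):
    """Map subspace coordinates z back to pose space."""
    return mean + z @ basis.T

def viterbi(log_pi, log_A, log_B):
    """Most likely HMM state path.

    log_pi[s]   : log prior of state s
    log_A[i, j] : log transition probability from state i to state j
    log_B[t, s] : log likelihood of the observation at time t under state s
    """
    T, S = log_B.shape
    delta = log_pi + log_B[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A      # scores[i, j]: arrive in j from i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Synthetic stand-in for training poses: 200 frames, 10 joints x 3 coordinates.
rng = np.random.default_rng(0)
poses = rng.normal(size=(200, 30))
arm = slice(0, 9)  # hypothetical: the first three joints form one arm

# Top level: coarse whole-body subspace. Lower level: finer subspace for one subpart.
top_mean, top_basis = fit_pca(poses, k=3)
arm_mean, arm_basis = fit_pca(poses[:, arm], k=2)

z_top = project(poses[0], top_mean, top_basis)       # whole-body coordinates
z_arm = project(poses[0, arm], arm_mean, arm_basis)  # subpart coordinates
```

In a tracker along these lines, the HMM states could index regions of the PCA subspace, with `log_B` computed from how well each state's poses match the image evidence in each frame; that mapping is left out here because the abstract does not specify it.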