Tree-structured probabilistic models admit simple, fast inference. However, they are not well suited to phenomena such as occlusion, where multiple components of an object may disappear simultaneously. Mixtures of trees appear to address this problem, at the cost of representing a large mixture. We demonstrate an efficient and compact representation of this mixture, which admits simple learning and inference algorithms. We use this method to build an automated tracker for Muybridge sequences of a variety of human activities. Tracking is difficult because the temporal dependencies rule out simple inference methods. We show how to use our model for efficient inference, using a method that alternates between spatial and temporal inference. The result is a tracker that (a) uses a very loose motion model, and so can track many different activities at a variable frame rate, and (b) is entirely automatic.
Sergey Ioffe, David A. Forsyth