We propose a new unsupervised learning method to obtain a layered pictorial structure (LPS) representation of an articulated object from video sequences. This is related, in turn, to methods for learning sprite-based representations of an image. The method we describe involves a new generative model for performing segmentation on a set of images, which includes the effects of motion blur and occlusion. An initial estimate of the parameters of the model is obtained by dividing the scene into rigidly moving components. The estimate of the matte of each part is then refined using a variation of the α-expansion graph cut algorithm, which has the advantage of achieving a strong local minimum over labels. Results are demonstrated on animals, for which an articulated LPS representation is naturally suited.
M. Pawan Kumar, Philip H. S. Torr, Andrew Zisserman
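To illustrate the kind of refinement step referred to above, the sketch below shows a move-making optimisation over per-pixel part labels with a Potts smoothness term. It is not the paper's implementation: the exact graph-cut (st-mincut) expansion move is replaced here by a greedy ICM-style sweep, and the function names (expansion_style_refinement, potts_energy) and parameters (lam, sweeps) are illustrative assumptions rather than quantities from the paper.

import numpy as np

def potts_energy(labels, unary, lam):
    """Total energy: per-pixel unary cost plus a Potts smoothness
    penalty lam for each disagreeing 4-connected neighbour pair."""
    h, w = labels.shape
    e = unary[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    e += lam * np.count_nonzero(labels[:, 1:] != labels[:, :-1])
    e += lam * np.count_nonzero(labels[1:, :] != labels[:-1, :])
    return e

def expansion_style_refinement(unary, lam=1.0, sweeps=5):
    """Move-making refinement of a pixel-wise part labelling.

    unary: (H, W, K) array of per-pixel costs for each of K part labels.
    For every label alpha, greedily accept single-pixel switches to alpha
    that lower the energy (a stand-in for the exact graph-cut expansion
    move used in the paper)."""
    h, w, k = unary.shape
    labels = unary.argmin(axis=2)          # initial labelling from unary costs
    for _ in range(sweeps):
        for alpha in range(k):
            for y in range(h):
                for x in range(w):
                    old = int(labels[y, x])
                    if old == alpha:
                        continue
                    # Change in unary cost if this pixel switches to alpha.
                    delta = unary[y, x, alpha] - unary[y, x, old]
                    # Change in the Potts term over the 4-neighbourhood.
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            nb = int(labels[ny, nx])
                            delta += lam * (int(alpha != nb) - int(old != nb))
                    if delta < 0:
                        labels[y, x] = alpha
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    costs = rng.random((32, 32, 3))        # toy per-pixel part costs, K = 3 parts
    refined = expansion_style_refinement(costs, lam=0.5)
    print("final energy:", potts_energy(refined, costs, 0.5))

In the actual method each binary "switch to alpha or keep the current label" subproblem is solved exactly with a graph cut, which is what yields the strong local minimum mentioned in the abstract; the greedy sweep above only approximates that move.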