This paper presents an online learning algorithm that constructs, from video sequences, an image-based representation useful for recognition and tracking. For a class of objects (e.g., human faces), a generic representation of the appearances of the class is learned off-line. From video of an instance of this class (e.g., a particular person), an appearance model is incrementally learned on-line using the prior generic model and successive frames from the video. More specifically, the generic and individual appearances are each represented as an appearance manifold that is approximated by a collection of sub-manifolds (named pose manifolds) and the connectivity between them. In turn, each sub-manifold is approximated by a low-dimensional linear subspace, while the connectivity is modeled by transition probabilities between pairs of sub-manifolds. We demonstrate that our online learning algorithm constructs an effective representation for face tracking, and its use in video-based face...
Kuang-Chih Lee, David J. Kriegman
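
To make the representation described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each pose sub-manifold is approximated by a PCA subspace, that transition probabilities are estimated by counting pose changes between consecutive frames, and that a Gaussian likelihood on the distance to each subspace is an acceptable stand-in for the appearance likelihood. All function names and the parameters `dim`, `smoothing`, and `sigma` are hypothetical.

```python
import numpy as np

def fit_subspace(samples, dim):
    """Fit a low-dimensional affine subspace to samples (n, d) via PCA.
    Returns (mean, basis), where basis has shape (d, dim) with orthonormal columns."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:dim].T

def subspace_distance(x, mean, basis):
    """Euclidean distance from image vector x to the affine subspace."""
    centered = x - mean
    residual = centered - basis @ (basis.T @ centered)
    return float(np.linalg.norm(residual))

def estimate_transitions(pose_labels, num_poses, smoothing=1.0):
    """Row-stochastic matrix of pose-to-pose transition probabilities,
    estimated by counting consecutive label pairs (assumed additive smoothing)."""
    counts = np.full((num_poses, num_poses), smoothing)
    for prev, cur in zip(pose_labels[:-1], pose_labels[1:]):
        counts[prev, cur] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def most_likely_pose(x, subspaces, transitions, prev_pose, sigma=1.0):
    """Pick the pose whose subspace best explains x, weighted by the
    probability of moving there from prev_pose (assumed Gaussian likelihood)."""
    dists = np.array([subspace_distance(x, m, b) for m, b in subspaces])
    likelihood = np.exp(-dists ** 2 / (2 * sigma ** 2))
    return int(np.argmax(likelihood * transitions[prev_pose]))

# Toy usage: two 1-D pose subspaces embedded in a 3-D "image" space.
t = np.linspace(-2.0, 2.0, 5)
pose0 = np.stack([t, np.zeros(5), np.zeros(5)], axis=1)      # along the x axis
pose1 = np.stack([np.zeros(5), t, np.full(5, 5.0)], axis=1)  # along y, offset in z
subspaces = [fit_subspace(pose0, 1), fit_subspace(pose1, 1)]
transitions = estimate_transitions([0, 0, 0, 1, 1, 1], num_poses=2)
frame = np.array([3.0, 0.0, 0.1])
pose = most_likely_pose(frame, subspaces, transitions, prev_pose=1)
```

In this toy setup the frame lies close to the first subspace, so that pose is selected even though the transition prior favors staying in the previous pose. An online variant would additionally update each subspace as new frames arrive, which this sketch omits.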