This paper addresses the fully automatic extraction of classifiable person features from a video stream with a challenging background. The task can be split into two parts: tracking the object and extracting distinctive features. To track a person, a system consisting of an Active Shape Model embedded in a particle filter framework was built. Its output, a shape representing the position and geometry of the person's head, serves as an initial guess for the subsequent Active Appearance Model, which enables high-precision matching of the head's texture. In this way, raw features are transformed into appearance parameters, which can then be used for a variety of classification tasks. The novelty of this framework is the hierarchical combination of the two models, which uses their similarities and exploits their differences to enhance robustness and performance in complex scenarios.
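
The tracking stage can be illustrated with a minimal sketch of one cycle of a generic bootstrap particle filter. This is not the system described here: the state is reduced to a 2D head position, the motion model is assumed to be a Gaussian random walk, and the function `toy_shape_score` is a hypothetical stand-in for the quality of an Active Shape Model fit at a candidate position. All names and parameters are illustrative assumptions.

```python
import math
import random


def particle_filter_step(particles, weights, score, motion_std, rng):
    """One resample/predict/update cycle of a bootstrap particle filter.

    particles: list of (x, y) hypotheses for the head position
    weights:   normalized weights from the previous cycle
    score:     callable returning a non-negative fit quality for a hypothesis
               (hypothetical stand-in for an Active Shape Model fit)
    """
    # Resample: draw hypotheses in proportion to their current weights.
    resampled = rng.choices(particles, weights=weights, k=len(particles))

    # Predict: diffuse each hypothesis (random-walk motion model, an assumption).
    predicted = [
        (x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
        for x, y in resampled
    ]

    # Update: weight each hypothesis by how well it explains the current frame.
    raw = [score(p) for p in predicted]
    total = sum(raw)
    if total > 0.0:
        new_weights = [w / total for w in raw]
    else:
        # Fall back to uniform weights if every hypothesis scores zero.
        new_weights = [1.0 / len(predicted)] * len(predicted)

    # Estimate: the weighted mean would serve as the initial guess
    # handed to the subsequent appearance-model stage.
    est_x = sum(w * p[0] for w, p in zip(new_weights, predicted))
    est_y = sum(w * p[1] for w, p in zip(new_weights, predicted))
    return predicted, new_weights, (est_x, est_y)


def toy_shape_score(position, target=(60.0, 40.0), sigma=10.0):
    """Hypothetical fit score: Gaussian falloff around a known target."""
    dx = position[0] - target[0]
    dy = position[1] - target[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def track(num_particles=500, num_steps=10, seed=0):
    """Run the filter on the toy score and return the final estimate."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0, 100), rng.uniform(0, 100))
                 for _ in range(num_particles)]
    weights = [1.0 / num_particles] * num_particles
    estimate = None
    for _ in range(num_steps):
        particles, weights, estimate = particle_filter_step(
            particles, weights, toy_shape_score, motion_std=2.0, rng=rng
        )
    return particles, weights, estimate
```

In a full system of the kind outlined above, each particle would carry shape parameters in addition to position, and the score would come from matching the shape model against image evidence rather than from a known target.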