We develop a classification algorithm for hybrid autoregressive models of human motion for the purpose of video-based analysis and recognition. We assume that some temporal statistics are extracted from the images, and we use them to infer a dynamical system that explicitly models contact forces. We then develop a distance between such models that explicitly factors out exogenous inputs that are not unique to an individual or her gait. We show that such a distance is more discriminative than the distance between simple linear systems, where most of the energy is devoted to modeling the dynamics of spurious nuisances such as contact forces.