Tracking individuals in extremely crowded scenes is a challenging task, primarily due to the motion and appearance variability produced by the large number of people within the scene. The individual pedestrians, however, collectively form a crowd that exhibits a spatially and temporally structured pattern within the scene. In this paper, we extract this steady-state but dynamically evolving motion of the crowd and leverage it to track individuals in videos of the same scene. We capture the spatial and temporal variations in the crowd’s motion by training a collection of hidden Markov models on the motion patterns within the scene. Using these models, we predict the local spatio-temporal motion patterns that describe the pedestrian movement at each space-time location in the video. Based on these predictions, we hypothesize the target’s movement between frames as it travels through the local space-time volume. In addition, we robustly model the individual’s unique motion and appe...