In this paper, we propose an algorithm for sustained tracking of humans that combines frame-to-frame articulated motion estimation with a per-frame body detection algorithm. The proposed approach automatically recovers from tracking errors and drift. Within a filtering framework, the frame-to-frame motion estimation algorithm replaces traditional dynamic models. Stable and accurate per-frame motion is estimated via an image-gradient-based algorithm that solves a linear constrained least squares system. The per-frame detector learns the appearance of different body parts and ‘sketches’ expected gradient maps to detect discriminant pose configurations in images. The resulting online algorithm is computationally efficient and has been extensively tested on a large dataset of sequences of drivers in vehicles, where it shows stability and sustained accuracy over thousands of frames.
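
To make the phrase "linear constrained least squares system" concrete, the following is a minimal generic sketch, not the formulation used in this paper. It assumes the constraints are linear equalities, so the problem is to minimise ||A x - b||^2 subject to C x = d, and it solves the resulting KKT system directly. The names `solve_linear`, `constrained_least_squares`, `A`, `b`, `C`, and `d` are hypothetical and chosen only for illustration; in a motion estimation setting, the rows of `A` would plausibly come from image-gradient terms and the rows of `C` from articulation constraints, but that mapping is an assumption.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    # Build the augmented matrix [M | rhs] as floats.
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def constrained_least_squares(A, b, C, d):
    """Minimise ||A x - b||^2 subject to C x = d (assumed equality constraints).

    Solves the KKT system
        [ 2 A^T A   C^T ] [ x      ]   [ 2 A^T b ]
        [ C         0   ] [ lambda ] = [ d       ]
    and returns only x.
    """
    n, p = len(A[0]), len(C)
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    K = [[2.0 * AtA[i][j] for j in range(n)] + [C[k][i] for k in range(p)]
         for i in range(n)]
    K += [list(C[k]) + [0.0] * p for k in range(p)]
    rhs = [2.0 * v for v in Atb] + list(d)
    return solve_linear(K, rhs)[:n]


# Toy usage: the point closest to (1, 2) that satisfies x0 + x1 = 1.
x = constrained_least_squares(
    A=[[1.0, 0.0], [0.0, 1.0]],
    b=[1.0, 2.0],
    C=[[1.0, 1.0]],
    d=[1.0],
)
```

Solving the KKT system in one step keeps the sketch short; for larger or ill-conditioned problems, a null-space or QR-based method would usually be preferred.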