We present a novel approach to tracking 2D human motion in uncalibrated monocular videos. Human motion usually exhibits time-varying patterns, and we propose to use locally learnt prior models to capture these characteristics. For each input image, our method automatically learns a local probability density model and a local dynamical model from a set of training examples that closely match the input. We evaluate the image likelihood by matching a deformable 2D human body model to the input image. The local models and the image likelihood are combined to optimize the pose for the current input. Experiments on both synthetic and real videos demonstrate the effectiveness of our method.
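The local-model step described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the feature representation, the neighbourhood size `k`, the Gaussian form of the local density, and the linear form of the local dynamics are all assumptions made for the sketch.

```python
import numpy as np

def local_models(query_feat, train_feats, train_poses, k=25):
    """Fit local models from the k training examples closest to the query.

    Returns a local Gaussian prior (mean, covariance) over poses and a
    local linear dynamical model A mapping pose_{t-1} -> pose_t, fitted
    from the neighbours and their temporal successors. Assumes rows of
    train_poses are consecutive frames (an assumption of this sketch).
    """
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    idx = np.argsort(dists)[:k]               # indices of nearest neighbours
    nn = train_poses[idx]                     # (k, D) neighbouring poses
    mu = nn.mean(axis=0)                      # local prior mean
    cov = np.cov(nn, rowvar=False) + 1e-6 * np.eye(nn.shape[1])  # regularized
    # Local dynamics: regress each neighbour's successor frame on the
    # neighbour itself (drop neighbours that are the last frame).
    valid = idx[idx < len(train_poses) - 1]
    A, *_ = np.linalg.lstsq(train_poses[valid], train_poses[valid + 1],
                            rcond=None)
    return mu, cov, A

def log_prior(pose, mu, cov):
    """Log of the local Gaussian prior, up to an additive constant."""
    diff = pose - mu
    return -0.5 * diff @ np.linalg.solve(cov, diff)
```

In a full tracker, the pose for the current frame would be chosen by maximizing the sum of this local log-prior, a dynamics term, and the image log-likelihood obtained from the deformable 2D body model.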