This paper addresses 3D tracking of the pose and animation of the human face in monocular image sequences using deformable 3D models. For each frame, the proposed adaptation is split into two consecutive stages: global and local. In the first stage, the 3D pose of the face is recovered using a RANSAC-based technique that combines the consensus measure with consistency with a statistical model of face texture. In the second stage, the local motion associated with some facial features is recovered using an active appearance model search. Adaptation examples demonstrate the feasibility and robustness of the developed framework.
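To make the first (global) stage more concrete, the following is a minimal sketch of a RANSAC loop whose hypothesis score combines the consensus measure (inlier count) with an appearance-consistency term. It is an illustration under simplifying assumptions, not the method developed in the paper: the pose is reduced to a 2D rigid transform (rotation plus translation) estimated from point correspondences, and the names `fit_rigid_2d`, `apply_rigid_2d`, `ransac_pose`, `texture_score`, and `weight` are hypothetical. In particular, `texture_score` is only a placeholder for a measure of consistency with a statistical face texture model and returns 0 by default.

```python
import math
import random


def fit_rigid_2d(p_a, p_b, q_a, q_b):
    """Rigid 2D pose (theta, tx, ty) from two correspondences.

    The rotation aligns the direction p_a -> p_b with q_a -> q_b,
    and the translation maps p_a exactly onto q_a.
    """
    theta = (math.atan2(q_b[1] - q_a[1], q_b[0] - q_a[0])
             - math.atan2(p_b[1] - p_a[1], p_b[0] - p_a[0]))
    # Wrap the angle into (-pi, pi].
    theta = math.atan2(math.sin(theta), math.cos(theta))
    c, s = math.cos(theta), math.sin(theta)
    tx = q_a[0] - (c * p_a[0] - s * p_a[1])
    ty = q_a[1] - (s * p_a[0] + c * p_a[1])
    return theta, tx, ty


def apply_rigid_2d(pose, p):
    """Apply the pose (theta, tx, ty) to a 2D point p."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def ransac_pose(model_pts, image_pts, n_iter=200, tol=0.05,
                texture_score=lambda pose: 0.0, weight=1.0, seed=0):
    """Return (pose, inlier_indices) with the highest combined score.

    score = number of inliers + weight * texture_score(pose)
    texture_score is a hypothetical callback (assumption): it stands in
    for an appearance-consistency measure and defaults to 0.
    """
    rng = random.Random(seed)
    indices = range(len(model_pts))
    best_pose, best_inliers, best_score = None, [], -math.inf
    for _ in range(n_iter):
        # Minimal sample: two correspondences define a rigid 2D pose.
        i, j = rng.sample(indices, 2)
        pose = fit_rigid_2d(model_pts[i], model_pts[j],
                            image_pts[i], image_pts[j])
        # Consensus set: correspondences reprojected within tol.
        inliers = []
        for k in indices:
            x, y = apply_rigid_2d(pose, model_pts[k])
            if math.hypot(x - image_pts[k][0], y - image_pts[k][1]) <= tol:
                inliers.append(k)
        score = len(inliers) + weight * texture_score(pose)
        if score > best_score:
            best_pose, best_inliers, best_score = pose, inliers, score
    return best_pose, best_inliers
```

In this sketch the appearance term only breaks ties or reweights hypotheses through `weight`; how the two terms are actually balanced in a real tracker is a design choice that the sketch does not settle.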