We present a generic and robust method for model-based global 3D head pose estimation in monocular, uncalibrated video sequences. The method relies on 3D/2D matching between 2D image features estimated throughout the sequence and 3D object features of a generic head model. Specifically, it combines motion and texture features in an iterative optimization procedure based on the downhill simplex algorithm. To handle large-amplitude motions, the pose parameters are initialized at each frame by a block matching procedure. For the same reason, we have developed a non-linear, optical flow-based interpolation algorithm that increases the frame rate. Experiments demonstrate that the method is stable over extended sequences including large head motions, occlusions, various head postures and lighting variations. The estimation accuracy is related to the head model, as established by using an ellipsoidal model and an ad hoc synth...
Marius Malciu, Françoise J. Prêteux
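
The 3D/2D matching by downhill simplex described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an orthographic camera, known point correspondences, a five-parameter pose (yaw, pitch, roll, tx, ty) and a plain squared reprojection error, whereas the paper combines motion and texture features. All names (`ellipsoid_points`, `project`, `cost`, `nelder_mead`), the ellipsoid radii and the simplex step sizes are hypothetical choices made for the example.

```python
import math

def ellipsoid_points(rx=7.0, ry=10.0, rz=8.0):
    """Sample 3D points on an ellipsoid (assumed stand-in for a head model)."""
    pts = []
    for i in range(1, 6):              # polar angle samples, poles excluded
        for j in range(6):             # azimuth samples
            theta = math.pi * i / 6
            phi = 2 * math.pi * j / 6
            pts.append((rx * math.sin(theta) * math.cos(phi),
                        ry * math.cos(theta),
                        rz * math.sin(theta) * math.sin(phi)))
    return pts

def project(points, pose):
    """Orthographic projection of rotated points; R = Rz(roll) Ry(yaw) Rx(pitch)."""
    yaw, pitch, roll, tx, ty = pose
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    r0 = (cr * cy, cr * sy * sp - sr * cp, cr * sy * cp + sr * sp)
    r1 = (sr * cy, sr * sy * sp + cr * cp, sr * sy * cp - cr * sp)
    return [(r0[0] * x + r0[1] * y + r0[2] * z + tx,
             r1[0] * x + r1[1] * y + r1[2] * z + ty) for x, y, z in points]

def cost(pose, points, observed):
    """Sum of squared 2D distances between projected and observed features."""
    return sum((u - ou) ** 2 + (v - ov) ** 2
               for (u, v), (ou, ov) in zip(project(points, pose), observed))

def nelder_mead(f, x0, steps, iters=2000):
    """Basic downhill simplex (Nelder-Mead) with standard coefficients."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):                 # initial simplex: x0 plus one step per axis
        v = list(x0)
        v[i] += steps[i]
        simplex.append(v)
    values = [f(v) for v in simplex]
    for _ in range(iters):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)                # reflection
        fr = f(xr)
        if fr < values[0]:
            xe = along(2.0)            # expansion
            fe = f(xe)
            simplex[-1], values[-1] = (xe, fe) if fe < fr else (xr, fr)
        elif fr < values[-2]:
            simplex[-1], values[-1] = xr, fr
        else:
            # Inside contraction if reflection is no better than the worst vertex,
            # outside contraction otherwise.
            xc = along(-0.5) if fr >= values[-1] else along(0.5)
            fc = f(xc)
            if fc < min(fr, values[-1]):
                simplex[-1], values[-1] = xc, fc
            else:                      # shrink every vertex toward the best one
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]

# Synthetic check: generate 2D features from a known pose, then recover it.
model = ellipsoid_points()
true_pose = [0.30, -0.20, 0.10, 5.0, -3.0]
observed = project(model, true_pose)
start = [0.0, 0.0, 0.0, 0.0, 0.0]      # stands in for a per-frame initialization
objective = lambda p: cost(p, model, observed)
initial_cost = objective(start)
estimate, final_cost = nelder_mead(objective, start, steps=[0.1, 0.1, 0.1, 1.0, 1.0])
```

Because the simplex method needs only function values and no gradients, the same loop would work with a cost built from texture or motion terms. The quality of the starting point matters, which is consistent with the per-frame initialization mentioned in the abstract.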