Head pose plays a special role in estimating a presenter’s focus and actions for lecture video editing. This paper presents an efficient and robust head pose estimation algorithm that copes with the new challenges arising in the content management of lecture videos. These challenges include speed requirements, low video quality, varied presentation styles, and complex settings in modern classrooms. Our algorithm is based on a robust hierarchical representation of skin color clustering and a set of automatically trained pose templates. Contextual information is also used to refine the pose estimates. Most importantly, we propose an online learning approach to handle different presentation styles, a problem that has not been addressed before. We show that the proposed approach can significantly improve the performance of pose estimation. In addition, we describe how posture is integrated with gesture to estimate focus for lecture video editing. Categories and Subjec...