This paper presents an automatic video editing system for archiving meetings that is based on head tracking. Systems that archive meetings are attracting considerable interest. Conventional systems use a fixed-viewpoint camera and simple camera selection based on participants’ utterances. However, they fail to adequately convey who is talking to whom, as well as nonverbal information about the participants. We focus on the participants’ head orientation, since it is useful for detecting the speaker and the person the speaker is addressing. To estimate each participant’s head orientation automatically, our system combines several modules to realize stereo-based head tracking. Using a majority decision, the system selects the shot of the participant that most participants are looking at. Experiments in which the edited videos were presented to viewers confirm the effectiveness of our system for several 3-participant conversations.
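
The majority decision used for shot selection can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the system's actual implementation: the function name `select_shot`, the `gaze_targets` mapping, and the tie-handling rule (fall back to a caller-supplied shot) are hypothetical, and the gaze targets are assumed to have been derived already from the estimated head orientations.

```python
from collections import Counter
from typing import Mapping, Optional


def select_shot(gaze_targets: Mapping[str, Optional[str]],
                fallback: Optional[str] = None) -> Optional[str]:
    """Pick the participant that the most other participants are looking at.

    gaze_targets maps each participant to the participant they are looking
    at, or None when their head orientation points at no one (assumed input).
    fallback is returned when no one is looked at or when the vote is tied
    (an assumed rule, e.g. keep the current shot).
    """
    # Count one vote per participant, ignoring missing and self-directed gazes.
    votes = Counter(
        target
        for participant, target in gaze_targets.items()
        if target is not None and target != participant
    )
    if not votes:
        return fallback

    ranked = votes.most_common()
    top_count = ranked[0][1]
    leaders = [name for name, count in ranked if count == top_count]

    # A tie means there is no majority, so defer to the fallback shot.
    if len(leaders) > 1:
        return fallback
    return leaders[0]


# Hypothetical 3-participant frame: A and C look at B, B looks at A.
chosen = select_shot({"A": "B", "B": "A", "C": "B"}, fallback="A")
```

In a 3-participant conversation a participant can receive at most two votes, so ties arise whenever the gazes are split; some fallback rule such as the one assumed here would be needed in practice.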