Semantic detection and recognition of the objects and events in a video stream are required to provide content-based annotation and retrieval of videos. Annotation makes it possible to reuse the video material at a later stage, e.g. to produce new TV programmes. A typical example is sports video, where clips showing key highlights and players are annotated so that they can be reused in short summaries for news and sports programmes. Selecting the most interesting actions among all the detected highlights requires further analysis: shots that contain a key action are typically followed by close-ups of the players who take part in it. Automatic identification of these players would therefore add considerable value to both the annotation and the retrieval of the key highlights and key players of a sports event. The problem of detecting and recognizing faces in broadcast videos is a wid...