This paper investigates the semantic analysis of broadcast tennis footage. We treat the spatio-temporal behaviour of an object in the footage as the embodiment of a semantic event, and track that object with a colour-based particle filter. Video syntax and audio features help delineate the temporal boundaries of these events. The system first parses the video sequence based on the geometry of the content in view and classifies each clip as a particular view type. The temporal behaviour of the serving player is then modelled with a hidden Markov model (HMM), so that each model represents a particular semantic episode. Events are finally summarised using a number of synthesised keyframes.
Niall Rea, Rozenn Dahyot, Anil C. Kokaram
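The abstract gives no implementation detail of the colour-based particle filter, so the following is only an illustrative sketch under assumed choices: a quantised RGB reference histogram compared via the Bhattacharyya coefficient, a random-walk motion model, and multinomial resampling. All function names and parameters (colour_histogram, track_step, sigma, motion_std, patch half-sizes) are hypothetical, not the authors' design.

```python
import numpy as np

rng = np.random.default_rng(0)


def colour_histogram(frame, cx, cy, half_w, half_h, bins=8):
    """Quantised RGB histogram of the image patch centred at (cx, cy)."""
    h, w, _ = frame.shape
    x0, x1 = max(0, int(cx - half_w)), min(w, int(cx + half_w))
    y0, y1 = max(0, int(cy - half_h)), min(h, int(cy + half_h))
    patch = frame[y0:y1, x0:x1].reshape(-1, 3)
    if patch.size == 0:  # particle drifted off the frame: return a flat histogram
        return np.full(bins ** 3, 1.0 / bins ** 3)
    idx = (patch // (256 // bins)).astype(int)
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalised histograms."""
    return np.sum(np.sqrt(p * q))


def track_step(frame, particles, weights, ref_hist,
               half_w=15, half_h=30, motion_std=5.0, sigma=0.2):
    """One predict / weight / resample cycle of a colour-based particle filter."""
    # Predict: random-walk motion model on the (x, y) particle positions.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)

    # Weight: colour similarity between each particle's patch and the reference.
    for i, (x, y) in enumerate(particles):
        hist = colour_histogram(frame, x, y, half_w, half_h)
        d2 = 1.0 - bhattacharyya(ref_hist, hist)  # squared Bhattacharyya distance
        weights[i] = np.exp(-d2 / (2.0 * sigma ** 2))
    weights = weights / weights.sum()

    # Estimate: weighted mean of the particle cloud gives the object position.
    estimate = weights @ particles

    # Resample: multinomial resampling back to uniform weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles)), estimate


# Minimal usage example on a synthetic frame.
frame = rng.integers(0, 256, size=(480, 720, 3), dtype=np.uint8)
particles = np.tile([360.0, 400.0], (200, 1))
weights = np.full(200, 1.0 / 200)
ref_hist = colour_histogram(frame, 360, 400, 15, 30)
particles, weights, estimate = track_step(frame, particles, weights, ref_hist)
print("estimated position:", estimate)
```

In a full system of the kind described, the per-frame position estimates from such a tracker would form the observation sequence fed to the HMMs that model the serving player's temporal behaviour.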