The audio scene from broadcast soccer can be used to identify highlights from the game. Audio cues derived from this scene provide valuable information about game events, as can the detection of key words used by the commentators. In this paper we investigate the feasibility of incorporating both commentator word recognition and information about the additive background noise in a hidden Markov model (HMM) structure. A limited set of audio cues, extracted from data collected from the 2006 FIFA World Cup, is used to create an extension to the Aurora-2 database. The new database is then tested with various parallel model combination (PMC) models and compared with the standard baseline, clean and multi-condition training methods. It is found that incorporating signal-to-noise ratio (SNR) and noise type information into the PMC process is beneficial to recognition performance.
Jack H. Longton, Philip J. B. Jackson