Inferences from time-series data can be greatly enhanced by taking into account multiple modalities. In some cases, such as audio of speech and the corresponding video of lip gestures, the different time-series are tightly coupled. We are interested in loosely-coupled time series where only the onsets of events are coupled in time. We present an extension of the forward-backward algorithm that can be used for inference and learning in event-coupled hidden Markov models and give results on a simplified multi-media indexing task where the objective is to detect an event whose onset is loosely coupled in audio and video.

Submitted to NIP98, Algorithms and Architectures, poster presentation.

1 Foreground

The combination of multiple modalities for inference has proven to be a very powerful way to increase detection and recognition performance (Yuhas et al. 1988; Becker and Hinton 1992; Bregler et al. 1994; de Sa and Ballard 1998). By combining the soft information provided by models of the di...
Trausti T. Kristjansson, Brendan J. Frey, Thomas S