Audio-visual event classification via spatial-temporal-audio words

In this paper, we propose a generative model-based approach to audio-visual event classification. The approach builds on a new unsupervised learning method that uses an extended probabilistic Latent Semantic Analysis (pLSA) model. We represent each video clip as a collection of spatial-temporal-audio words, generated by fusing visual and audio features within the pLSA model. Each audio-visual event class is treated as a latent topic in this model. The probability distributions of the spatial-temporal-audio words are learned from training examples, a set of videos representing different types of audio-visual events. Experimental results demonstrate the effectiveness of the proposed approach.
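The abstract gives no implementation details, but the bag-of-words-plus-pLSA pipeline it describes can be sketched. Below is a minimal EM fitter for a plain pLSA model over a clip-by-word count matrix; the function name plsa, the input counts, and all parameter choices are illustrative assumptions, and the paper's extended pLSA variant and audio-visual feature fusion are not reproduced here.

import numpy as np

def plsa(counts, n_topics, n_iters=100, seed=0):
    # counts[d, w] = how often spatial-temporal-audio word w occurs in clip d
    # (hypothetical input; the paper's actual feature extraction is not shown).
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Randomly initialize P(z|d) and P(w|z), each row normalized to sum to 1.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        # E-step: responsibilities P(z|d,w) proportional to P(z|d) * P(w|z).
        resp = p_z_d[:, :, None] * p_w_z[None, :, :]   # shape (docs, topics, words)
        resp /= resp.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from count-weighted responsibilities.
        weighted = counts[:, None, :] * resp
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

With event classes playing the role of topics, a clip d can then be assigned the class argmax over z of P(z|d), i.e. np.argmax(p_z_d, axis=1). This is only a sketch of standard pLSA; the fusion of the visual and audio vocabularies that the paper proposes is not modeled here.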
Type Conference
Year 2008
Where ICPR (IEEE)
Authors Ming Li, Sanqing Hu, Shih-Hsi Liu, Sung Baang, Yu Cao