In this paper, we propose a generative model-based approach to audio-visual event classification. The approach builds on a new unsupervised learning method that uses an extended probabilistic Latent Semantic Analysis (pLSA) model. We represent each video clip as a collection of spatial-temporal-audio words obtained by fusing visual and audio features, and we treat each audio-visual event class as a latent topic in the pLSA model. The probability distributions of the spatial-temporal-audio words over these topics are learned from training examples, a set of videos representing different types of audio-visual events. Experimental results demonstrate the effectiveness of the proposed approach.
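To make the topic-model component concrete, the sketch below shows a plain pLSA fit with EM over a clip-by-word count matrix, where each "document" is a video clip and each "word" is a quantized spatial-temporal-audio descriptor. This is a minimal illustration of standard pLSA, not the extended model proposed in the paper; the NumPy implementation, function names, and the toy data are assumptions introduced here for exposition.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit a plain pLSA model with EM.

    counts : (n_clips, n_words) count matrix; each row is one video clip's
             bag of quantized spatial-temporal-audio words (the quantization
             step is assumed, not specified here).
    Returns P(w|z) of shape (n_topics, n_words) and P(z|d) of shape
    (n_clips, n_topics), where each latent topic z plays the role of an
    audio-visual event class.
    """
    rng = np.random.default_rng(seed)
    n_clips, n_words = counts.shape

    # Random initialization of the two conditional distributions.
    p_w_given_z = rng.random((n_topics, n_words))
    p_w_given_z /= p_w_given_z.sum(axis=1, keepdims=True)
    p_z_given_d = rng.random((n_clips, n_topics))
    p_z_given_d /= p_z_given_d.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z|d,w) proportional to P(z|d) * P(w|z).
        joint = p_z_given_d[:, :, None] * p_w_given_z[None, :, :]
        denom = joint.sum(axis=1, keepdims=True)
        denom[denom == 0] = 1e-12
        p_z_given_dw = joint / denom                     # (n_clips, n_topics, n_words)

        # M-step: re-estimate P(w|z) and P(z|d) from expected counts n(d,w)*P(z|d,w).
        expected = counts[:, None, :] * p_z_given_dw
        p_w_given_z = expected.sum(axis=0)
        p_w_given_z /= np.maximum(p_w_given_z.sum(axis=1, keepdims=True), 1e-12)
        p_z_given_d = expected.sum(axis=2)
        p_z_given_d /= np.maximum(p_z_given_d.sum(axis=1, keepdims=True), 1e-12)

    return p_w_given_z, p_z_given_d

# Toy usage: 6 clips, a 20-word audio-visual vocabulary, 3 latent event classes.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    toy_counts = rng.integers(0, 5, size=(6, 20)).astype(float)
    p_w_given_z, p_z_given_d = plsa(toy_counts, n_topics=3)
    print("most likely event topic per clip:", p_z_given_d.argmax(axis=1))
```

In this sketch, classification of a new clip would amount to folding its word counts into the learned P(w|z) and picking the topic with the highest posterior; the paper's extension of pLSA for fusing the audio and visual modalities is not reproduced here.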