Sciweavers

TOMCCAP
2010

Audio-visual atoms for generic video concept classification

We investigate the challenging problem of joint audio-visual analysis of generic videos for semantic concept detection. We propose a novel representation, the Short-term Audio-Visual Atom (S-AVA), for improved concept detection. An S-AVA is defined as a short-term region track associated with regional visual features and background audio features. An effective algorithm, named Short-Term Region tracking with joint Point Tracking and Region Segmentation (STR-PTRS), is developed to extract S-AVAs from generic videos under challenging conditions such as uneven lighting, clutter, occlusions, and complicated motions of both objects and camera. Discriminative audio-visual codebooks are constructed on top of S-AVAs using Multiple Instance Learning. Codebook-based features are then generated for semantic concept detection. We extensively evaluate our algorithm on Kodak's consumer benchmark video set, collected from real users. Experimental results confirm significant performance impro...
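To make the abstract's terms concrete, the sketch below shows one way an atom and a codebook-based feature could be represented. It is an illustration under stated assumptions, not the authors' implementation: the `Atom` class, the concatenation of visual and audio features, the `codebook_features` function, and the `scale` parameter are all hypothetical, and the codebook is taken as given rather than learned with Multiple Instance Learning.

```python
from dataclasses import dataclass
from math import dist, exp
from typing import List, Sequence


@dataclass
class Atom:
    """Hypothetical container for one short-term audio-visual atom."""
    visual: List[float]  # features of the tracked region (illustrative)
    audio: List[float]   # background audio features from the same time window

    def vector(self) -> List[float]:
        # Joint descriptor by plain concatenation (an assumption).
        return self.visual + self.audio


def codebook_features(atoms: Sequence[Atom],
                      codebook: Sequence[Sequence[float]],
                      scale: float = 1.0) -> List[float]:
    """Map a video, treated as a bag of atoms, to one value per codeword.

    Each value is the similarity between a codeword and its closest atom,
    a common bag-level encoding in multiple-instance settings.
    """
    if not atoms:
        return [0.0] * len(codebook)
    features = []
    for codeword in codebook:
        nearest = min(dist(atom.vector(), codeword) for atom in atoms)
        features.append(exp(-nearest / scale))
    return features


# Toy data: two atoms with 2-D visual and 1-D audio features.
video = [Atom([0.0, 1.0], [0.5]), Atom([1.0, 0.0], [0.2])]
codebook = [[0.0, 1.0, 0.5], [5.0, 5.0, 5.0]]
feats = codebook_features(video, codebook)
```

The resulting vector has one entry per codeword, so videos with different numbers of atoms map to features of the same length, which is what lets a standard classifier consume them.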
Added 22 May 2011
Updated 22 May 2011
Type Journal
Year 2010
Where TOMCCAP
Authors Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui