Learning latent temporal structure for complex event detection

In this paper, we tackle the problem of understanding the temporal structure of complex events in highly varying videos obtained from the Internet. Towards this goal, we utilize a conditional model trained in a max-margin framework that is able to automatically discover discriminative and interesting segments of video, while simultaneously achieving competitive accuracies on difficult detection and recognition tasks. We introduce latent variables over the frames of a video, and allow our algorithm to discover and assign sequences of states that are most discriminative for the event. Our model is based on the variable-duration hidden Markov model, and models durations of states in addition to the transitions between states. The simplicity of our model allows us to perform fast, exact inference using dynamic programming, which is extremely important when we set our sights on being able to process a very large number of videos quickly and efficiently. We show promising results on the O...
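To make the inference step described in the abstract more concrete, here is a minimal sketch of exact dynamic-programming inference for a generic variable-duration (semi-Markov) chain. It is not the authors' implementation: the function name `segment_video`, the three score arrays, and the rule that adjacent segments must use different states are all assumptions made for illustration. In the paper's setting such scores would come from the learned max-margin model.

```python
import numpy as np


def segment_video(frame_scores, transition, duration_scores):
    """Best-scoring segmentation of T frames into states with explicit durations.

    All inputs are hypothetical score arrays (higher is better):
      frame_scores:    (T, S) score of assigning frame t to state s
      transition:      (S, S) score of switching from state a to state b
      duration_scores: (S, D) score of state s lasting d + 1 frames
    Returns (best_score, [(state, start, end_exclusive), ...]).
    """
    T, S = frame_scores.shape
    D = duration_scores.shape[1]
    # prefix[t, s] = sum of frame_scores[:t, s]; makes each segment sum O(1).
    prefix = np.vstack([np.zeros((1, S)), np.cumsum(frame_scores, axis=0)])
    # best[t, s]: best score for frames [0, t) whose last segment is in state s.
    best = np.full((T + 1, S), -np.inf)
    back = {}
    for t in range(1, T + 1):
        for s in range(S):
            for d in range(1, min(D, t) + 1):
                start = t - d
                seg = prefix[t, s] - prefix[start, s] + duration_scores[s, d - 1]
                if start == 0:
                    cand, prev = seg, None
                else:
                    # Assumption: the previous segment uses a different state.
                    prev_scores = best[start] + transition[:, s]
                    prev_scores[s] = -np.inf
                    prev = int(np.argmax(prev_scores))
                    cand = prev_scores[prev] + seg
                if cand > best[t, s]:
                    best[t, s] = cand
                    back[(t, s)] = (start, prev)
    s = int(np.argmax(best[T]))
    if not np.isfinite(best[T, s]):
        raise ValueError("no valid segmentation; increase the maximum duration D")
    score = float(best[T, s])
    # Follow back-pointers from the last segment to the first.
    segments, t = [], T
    while t > 0:
        start, prev = back[(t, s)]
        segments.append((s, start, t))
        t, s = start, prev
    return score, segments[::-1]
```

For a fixed number of states S and maximum duration D, the work grows linearly with the number of frames T (roughly T × D × S² operations), which is the kind of property that makes exact inference practical over large video collections.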

Kevin Tang, Fei-Fei Li, Daphne Koller

Computer Vision | Cvpr 2012 | Exact Inference | Latent Variables | Variable Duration |

Post Info

Added	28 Sep 2012
Updated	28 Sep 2012
Type	Conference
Year	2012
Where	CVPR
Authors	Kevin Tang, Fei-Fei Li, Daphne Koller
