Action snippets: How many frames does human action recognition require?

Visual recognition of human actions in video clips has been an active field of research in recent years. However, most published methods either analyse an entire video and assign it a single action label, or use relatively large lookahead to classify each frame. Contrary to these strategies, human vision proves that simple actions can be recognised almost instantaneously. In this paper, we present a system for action recognition from very short sequences ("snippets") of 1-10 frames, and systematically evaluate it on standard data sets. It turns out that even local shape and optic flow for a single frame are enough to achieve 90% correct recognitions, and snippets of 5-7 frames (0.3-0.5 seconds of video) are enough to achieve a performance similar to the one obtainable with the entire video sequence.
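
To make the notion of a snippet concrete, the following is a minimal sketch of how a frame sequence might be cut into short overlapping windows and how a snippet length maps to a duration. It is not the authors' implementation: the function names `make_snippets` and `snippet_duration`, the `stride` parameter, and the 15 fps frame rate are all assumptions chosen for illustration. The frame rate is only inferred from the abstract's figures of 5-7 frames for 0.3-0.5 seconds and is not stated in it.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_snippets(frames: Sequence[T], length: int, stride: int = 1) -> List[Sequence[T]]:
    """Cut a frame sequence into windows of `length` consecutive frames.

    Hypothetical helper: windows start every `stride` frames, and a
    sequence shorter than `length` yields no snippets.
    """
    if length < 1 or stride < 1:
        raise ValueError("length and stride must be positive")
    last_start = len(frames) - length
    return [frames[i:i + length] for i in range(0, last_start + 1, stride)]


def snippet_duration(length: int, fps: float) -> float:
    """Duration in seconds of a snippet of `length` frames at `fps` frames per second."""
    return length / fps


# Stand-in for decoded video frames; real frames would be image arrays.
frames = list(range(10))
snippets = make_snippets(frames, length=5)

# Assumed frame rate (not given in the abstract).
ASSUMED_FPS = 15.0
short = snippet_duration(5, ASSUMED_FPS)
long = snippet_duration(7, ASSUMED_FPS)
```

Under the assumed frame rate, snippets of 5 and 7 frames last about a third and just under half a second, consistent with the range quoted in the abstract. A classifier would then label each snippet independently rather than the whole clip.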

Konrad Schindler, Luc J. Van Gool

Computer Vision | CVPR 2008 | Entire Video Sequence | Human Actions | Single Action Label | Single Frame | Standard Data Sets |

Post Info

Added	12 Oct 2009
Updated	28 Oct 2009
Type	Conference
Year	2008
Where	CVPR
Authors	Konrad Schindler, Luc J. Van Gool
