In this paper we address the problem of recognising interactions between two people in realistic scenarios for video retrieval purposes. We develop a per-person descriptor that uses attention (head orientation) together with the local spatial and temporal context in a neighbourhood of each detected person. Using head orientation mitigates camera-view ambiguities, while the local context, comprising histograms of gradients and motion, aims to capture cues such as hand and arm movement. We also employ structured learning to capture spatial relationships between interacting individuals. Using this descriptor, we train an initial set of one-versus-rest linear SVM classifiers, one for each interaction. Noting that people generally face each other while interacting, we then learn a structured SVM that combines head orientation and the relative location of people in a frame to improve upon the initial classification obtained with our descriptor. To test the efficacy of our method, we have created a new d...
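The one-versus-rest scheme described above can be sketched as follows. This is a minimal, illustrative implementation using a perceptron-style linear classifier as a stand-in for a linear SVM; the feature vectors and interaction labels are toy placeholders, not the paper's actual per-person descriptors.

```python
def train_one_vs_rest(samples, labels, classes, epochs=20, lr=0.1):
    # One linear classifier per interaction class: positives are that
    # class, negatives are all the rest. A perceptron update is used
    # here as a simple stand-in for training a linear SVM.
    dim = len(samples[0])
    models = {c: [0.0] * (dim + 1) for c in classes}  # weights + bias
    for c in classes:
        w = models[c]
        for _ in range(epochs):
            for x, y in zip(samples, labels):
                target = 1.0 if y == c else -1.0
                score = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
                if target * score <= 0:  # misclassified: update weights
                    for i, xi in enumerate(x):
                        w[i] += lr * target * xi
                    w[-1] += lr * target  # bias update
    return models

def predict(models, x):
    # Assign the interaction class whose classifier scores highest.
    def score(w):
        return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
    return max(models, key=lambda c: score(models[c]))

# Toy 2-D "descriptors" and labels (illustrative only).
X = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
y = ["handshake", "handshake", "hug", "hug"]
clf = train_one_vs_rest(X, y, ["handshake", "hug"])
print(predict(clf, [0.95, 0.15]))  # -> handshake
```

In practice a max-margin solver with a hinge loss and regularisation would replace the perceptron updates, but the one-classifier-per-class structure and the argmax over per-class scores are the same.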