We present algorithms for recognizing human motion in monocular video sequences, based on discriminative Conditional Random Field (CRF) and Maximum Entropy Markov Models (MEMM). Existing approaches to this problem typically use generative (joint) structures like the Hidden Markov Model (HMM). Therefore they have to make simplifying, often unrealistic assumptions on the conditional independence of observations given the motion class labels and cannot accommodate overlapping features or long term contextual dependencies in the observation sequence. In contrast, conditional models like the CRFs seamlessly represent contextual dependencies, support efficient, exact inference using dynamic programming, and their parameters can be trained using convex optimization. We introduce conditional graphical models as complementary tools for human motion recognition and present an extensive set of experiments that show how these typically outperform HMMs in classifying not only diverse human activit...
Cristian Sminchisescu, Atul Kanaujia, Dimitris N.