— Common human actions are instantly recognizable by people and increasingly machines need to understand this language if they are to engage smoothly with people. Here we introduce a new method for automated human action recognition. The proposed method represents videos as a tangent bundle on a Grassmann manifold. Videos are expressed as third order tensors and factorized to a set of tangent spaces. Tangent vectors are then computed between elements on a Grassmann manifold and exploited for action classification. In particular, logarithmic mapping is applied to map a point from the manifold to tangent vectors centered at a given element. The canonical metric is used to induce the intrinsic distance for a set of tangent spaces. Empirical results show that our method is effective on both uniform and non-uniform backgrounds for action classification. We achieve recognition rates of 91% on the Cambridge gesture dataset, 88% on the UCF sport dataset, and 97% on the KTH human action dat...
Yui Man Lui, J. Ross Beveridge