In this paper we describe a recognition approach based on the notion of primitives. Rather than recognizing actions from temporal trajectories or temporal volumes, primitive-based recognition represents a temporal sequence containing an action by only a few characteristic time instances. The human's whereabouts at these instances are extracted using double difference images and represented by four features. In each frame, the primitive, if any, that best explains the observed data is identified. This leads to a discrete recognition problem, since a video sequence is converted into a string of symbols, each representing a primitive. After pruning the string, a probabilistic Edit Distance classifier is applied to identify which action best describes the pruned string. The approach is evaluated on five one-arm gestures and the recog
Preben Fihl, Michael B. Holte, Thomas B. Moeslund,
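
The string-based recognition step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the pruning rule (dropping frames without a primitive and runs shorter than `min_run`), the symbol alphabet, the `"."` marker for frames with no primitive, and the gesture names and templates are all hypothetical. The sketch also uses the plain Levenshtein edit distance, not the probabilistic variant the paper refers to.

```python
from itertools import groupby

NO_PRIMITIVE = "."  # assumed marker for frames where no primitive was found


def prune(symbols, min_run=2):
    """Reduce a per-frame symbol string to one symbol per detected primitive.

    Assumed rule: a primitive must be observed in at least `min_run`
    consecutive frames to be kept; shorter runs are treated as noise.
    """
    kept = [
        sym
        for sym, run in groupby(symbols)
        if sym != NO_PRIMITIVE and len(list(run)) >= min_run
    ]
    # Removing a noisy run can leave equal symbols adjacent, so collapse again.
    return "".join(sym for sym, _ in groupby(kept))


def edit_distance(a, b):
    """Plain Levenshtein distance with unit costs (not the probabilistic one)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(
                min(
                    prev[j] + 1,               # delete ca
                    curr[j - 1] + 1,           # insert cb
                    prev[j - 1] + (ca != cb),  # substitute, or match at no cost
                )
            )
        prev = curr
    return prev[-1]


def classify(symbols, templates):
    """Return the action whose template string is closest to the pruned input."""
    pruned = prune(symbols)
    return min(templates, key=lambda name: edit_distance(pruned, templates[name]))


# Hypothetical action templates, each a short sequence of primitive symbols.
TEMPLATES = {"point_right": "ABC", "raise_arm": "ADE"}

if __name__ == "__main__":
    frames = "..AAADDD.EEX"  # one symbol per video frame
    print(prune(frames), classify(frames, TEMPLATES))
```

In this sketch a noisy twelve-frame string is reduced to three symbols before matching, which is what makes the recognition problem discrete: the classifier only compares short strings against per-action templates.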