This paper describes a general methodology for automated recognition of complex human activities. The methodology uses a representation scheme based on a context-free grammar (CFG) to describe composite actions and interactions. The CFG-based representation enables us to formally define complex human activities in terms of simple actions or movements. Human activities are classified into three categories: atomic action, composite action, and interaction. Our system not only represents complex human activities formally but also recognizes the represented actions and interactions with high accuracy. Image sequences are processed to extract poses and gestures. Based on the gestures, the system detects actions and interactions occurring in a sequence of image frames. Our results show that the system is able to represent composite actions and interactions naturally. The system was tested on representing and recognizing eight types of interactions: approach, depart, point, shake-hands, hug, p...
Michael S. Ryoo, J. K. Aggarwal