This paper proposes an approach to 2D gesture recognition that models each gesture as a Finite State Machine (FSM) in spatial-temporal space. The model is constructed semi-automatically. The structure of the model is first decided manually, based on the observed spatial topology of the data. The model is then refined by iterating between two stages: data segmentation and model training. Given continuous training data for a single gesture, we first roughly segment the gesture trajectory into phrases using spatial information alone, and use the segmentation results to initialize an FSM. The model is then used to re-segment the data, and the re-segmentation results are used to refine the model parameters. After the FSM is trained, we incorporate a modified Knuth-Morris-Pratt algorithm into the FSM recognition procedure to speed up gesture recognition. The computational efficiency of the FSM recognizers allows real-time, on-line performance.
Pengyu Hong, Thomas S. Huang, Matthew Turk
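
The alternation between data segmentation and model training described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each state is summarized only by a 2D center, uses spatial information alone (the temporal part of the model is omitted), and treats the hand-picked initial centers as a stand-in for the manually decided structure. The function names `segment`, `train`, and `refine` are hypothetical.

```python
import math


def segment(points, centers):
    """Data segmentation stage: label each 2D sample with its nearest state."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]


def train(points, labels, centers):
    """Model training stage: re-estimate each state center from its samples."""
    new_centers = []
    for k, old_center in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Assumption: a state that lost all samples keeps its old center.
            new_centers.append(old_center)
            continue
        mean_x = sum(p[0] for p in members) / len(members)
        mean_y = sum(p[1] for p in members) / len(members)
        new_centers.append((mean_x, mean_y))
    return new_centers


def refine(points, initial_centers, max_iters=20):
    """Alternate segmentation and training until the labels stop changing."""
    centers = list(initial_centers)
    labels = segment(points, centers)          # rough initial segmentation
    for _ in range(max_iters):
        centers = train(points, labels, centers)
        new_labels = segment(points, centers)  # re-segment with refined model
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels


if __name__ == "__main__":
    # Toy trajectory visiting two spatial regions (made-up data).
    trajectory = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    centers, labels = refine(trajectory, [(2.0, 2.0), (8.0, 8.0)])
    print(labels)
    print(centers)
```

In this simplified form the loop reduces to a k-means-style update; a fuller model would also need the temporal constraints and state ordering that the abstract attributes to the FSM.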