A perceptual human-machine interface based on the visual appearance of hand movements is presented. Gestures are defined as the temporal evolution of 3D poses of the user's hand. Each gesture is described by five temporal sequences corresponding to the time evolution of the stereo depth information at the fingertips. To this end, feature processing directly exploits binocular disparity information considered in its full spatio-temporal dimension. Hidden Markov Models are used to represent the statistical properties of the gestures. The high recognition rate obtained by experimental testing on a five-gesture alphabet validates the approach.
Giulia Gastaldi, Alessandro Pareschi, Silvio P. Sa
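
The recognition step summarized in the abstract can be illustrated with a minimal sketch: one Hidden Markov Model per gesture, each scored with the forward algorithm on a sequence of five-dimensional fingertip depth vectors, with the highest-likelihood model giving the label. Everything specific here is an assumption made for illustration and is not taken from the paper: the diagonal Gaussian emissions, the two-state left-to-right topology, the gesture names `push` and `pull`, the depth values, and the helper names `make_model`, `log_likelihood`, and `classify`.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )


def log_likelihood(seq, model):
    """Forward algorithm in log space: returns log P(seq | model)."""
    log_pi, log_a, means, variances = model
    n = len(log_pi)
    # Initialisation: start probability times emission of the first frame.
    alpha = [
        log_pi[s] + log_gaussian(seq[0], means[s], variances[s])
        for s in range(n)
    ]
    # Recursion: sum over predecessor states, then emit the current frame.
    for x in seq[1:]:
        alpha = [
            log_sum_exp([alpha[r] + log_a[r][s] for r in range(n)])
            + log_gaussian(x, means[s], variances[s])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def make_model(start_depth, end_depth, var=0.01):
    """Hypothetical two-state left-to-right HMM (assumed, not from the paper).

    All five fingertips sit near start_depth in state 0 and near
    end_depth in state 1.
    """
    log_pi = [0.0, -math.inf]                # always start in state 0
    log_a = [
        [math.log(0.5), math.log(0.5)],      # stay in 0 or move to 1
        [-math.inf, 0.0],                    # state 1 is absorbing
    ]
    means = [[start_depth] * 5, [end_depth] * 5]
    variances = [[var] * 5, [var] * 5]
    return log_pi, log_a, means, variances


def classify(seq, models):
    """Return the name of the gesture model with the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(seq, models[name]))


# Toy gesture alphabet with made-up depth values (arbitrary units).
models = {"push": make_model(0.6, 0.3), "pull": make_model(0.3, 0.6)}

# One observation per frame: depth at each of the five fingertips.
observed = [[0.6] * 5, [0.58] * 5, [0.32] * 5, [0.3] * 5]
print(classify(observed, models))
```

In a real system the model parameters would be estimated from training sequences, for example with Baum-Welch re-estimation, rather than written by hand as in this toy example.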