The quest for a vision system capable of representing and recognizing arbitrary motions benefits from a low-dimensional, non-specific representation of flow fields, to be used in high-level classification tasks. We present Zernike polynomials as an ideal candidate for such a representation. The basis of Zernike polynomials is complete and orthogonal, and can be used for describing many types of motion at many scales. Starting from image sequences, we derive locally smooth image velocities using a robust estimation procedure and then compute compact representations of the flow in the Zernike basis. Continuous-density hidden Markov models are trained on the resulting temporal sequences of coefficient vectors and are used for subsequent classification. We present results of our method applied to image sequences of facial expressions, both with and without significant rigid head motion, and to sequences of lip motion from a known database. We demonstrate that the Zernike representa...
Jesse Hoey, James J. Little
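
As a rough illustration of what a compact Zernike representation of a flow field might look like, the following minimal sketch projects one scalar flow component, sampled on a square grid, onto a few low-order real Zernike polynomials over the unit disk. It is not the authors' implementation: the function names (`zernike_radial`, `zernike`, `disk_samples`, `project`), the pixel-centre sampling, and the per-mode least-squares projection are all assumptions made for illustration, and the robust flow estimation and hidden Markov model stages are not shown.

```python
import math


def zernike_radial(n, m, rho):
    """Radial polynomial R_n^m(rho), defined for n >= |m| with n - |m| even."""
    m = abs(m)
    if (n - m) % 2:
        return 0.0
    total = 0.0
    for k in range((n - m) // 2 + 1):
        num = (-1) ** k * math.factorial(n - k)
        den = (math.factorial(k)
               * math.factorial((n + m) // 2 - k)
               * math.factorial((n - m) // 2 - k))
        total += num / den * rho ** (n - 2 * k)
    return total


def zernike(n, m, rho, theta):
    """Real Zernike polynomial: cosine form for m >= 0, sine form for m < 0."""
    radial = zernike_radial(n, m, rho)
    angular = math.cos(m * theta) if m >= 0 else math.sin(-m * theta)
    return radial * angular


def disk_samples(size):
    """Yield (x, y, rho, theta) at pixel centres of a size x size grid
    that fall inside the unit disk (hypothetical sampling scheme)."""
    for i in range(size):
        for j in range(size):
            x = (2 * j + 1) / size - 1
            y = (2 * i + 1) / size - 1
            rho = math.hypot(x, y)
            if rho <= 1.0:
                yield x, y, rho, math.atan2(y, x)


def project(field, modes, size=64):
    """Approximate the coefficient of each (n, m) mode for a scalar field
    f(x, y), projecting on one mode at a time. This relies on the basis
    being approximately orthogonal on the sampled disk."""
    samples = list(disk_samples(size))
    coeffs = {}
    for n, m in modes:
        num = den = 0.0
        for x, y, rho, theta in samples:
            z = zernike(n, m, rho, theta)
            num += field(x, y) * z
            den += z * z
        coeffs[(n, m)] = num / den
    return coeffs


if __name__ == "__main__":
    # Synthetic horizontal flow component: uniform shift plus expansion in x.
    u = lambda x, y: 0.5 + 2.0 * x
    print(project(u, [(0, 0), (1, 1), (1, -1)], size=32))
```

Under these assumptions, a two-component flow field would be summarized by calling `project` once for the horizontal and once for the vertical component and concatenating the coefficients into one feature vector per frame; the sequence of such vectors is the kind of input a continuous-density hidden Markov model could be trained on.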