We consider the problem of learning how a person's face behaves in a long video sequence, with the aim of synthesising convincing sequences demonstrating the same behaviours. We describe a novel approach to segment a sequence into short sections, each representing a distinct action (or a part of an action). These sections are grouped and a model of the variability of the action learnt. A variable length Markov model is trained on the sequence of such actions to learn the temporal relationships. The result is a system that can generate realistic sequences of an individual face.
Franck Bettinger, Timothy F. Cootes, Christopher J