We present a novel method for transferring speech animation recorded in low-resolution videos onto realistic 3D facial models. Unsupervised learning is applied to a speech video corpus to find the underlying manifold of facial configurations, and K-means clustering in this low-dimensional space identifies key speech-related facial shapes. With a small set of laser-scanned 3D models corresponding to the cluster centroids, the facial animation in the 2D videos is transferred onto 3D shapes. In particular, a weak perspective projection model allows the underlying mandible rotation to be recovered from the videos and used to drive the 3D skull motion. The adaptation of a generic skull to the facial models is guided by a 2D image, the Tissue Map. With parsimonious data requirements, our system accomplishes the animation transfer and achieves realistic rendering supported by the underlying anatomical structure.
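To make the clustering step concrete, the sketch below embeds per-frame facial landmarks into a low-dimensional space and picks one representative frame per cluster. The abstract does not specify the unsupervised technique, so PCA stands in for the manifold learning here; the array shapes, component count, and cluster count are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: PCA embedding (a stand-in for the unsupervised manifold
# learning) followed by K-means to select key speech-related facial shapes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# frames: (n_frames, n_landmarks * 2) flattened 2D landmark coordinates
# tracked from the speech video corpus (hypothetical placeholder data).
rng = np.random.default_rng(0)
frames = rng.normal(size=(5000, 2 * 68))

pca = PCA(n_components=10)           # low-dimensional facial configuration space
embedded = pca.fit_transform(frames)

kmeans = KMeans(n_clusters=16, n_init=10, random_state=0)
labels = kmeans.fit_predict(embedded)

# For each cluster, the frame nearest the centroid is a candidate "key
# facial shape" to be captured as a laser-scanned 3D model.
key_frames = []
for c in range(kmeans.n_clusters):
    members = np.flatnonzero(labels == c)
    dists = np.linalg.norm(embedded[members] - kmeans.cluster_centers_[c], axis=1)
    key_frames.append(members[np.argmin(dists)])
print(sorted(key_frames))
```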
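The mandible-rotation recovery can likewise be illustrated under a weak perspective (scaled orthographic) camera: rotate rest-pose jaw points about a fixed condylar axis, project, and pick the angle that best matches the observed 2D landmarks. The condylar axis, rest-pose points, camera scale, and brute-force angle scan below are all illustrative assumptions, not the paper's calibration or optimization procedure.

```python
# Minimal sketch: recover a 1-DOF jaw-opening angle from 2D landmarks
# under a weak perspective projection model.
import numpy as np

def rot_x(theta):
    """Rotation about the x-axis (assumed condylar axis)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0],
                     [0, c, -s],
                     [0, s,  c]])

def project(points3d, scale, t2d):
    """Weak perspective: uniform scale, drop depth, add 2D translation."""
    return scale * points3d[:, :2] + t2d

def recover_jaw_angle(jaw_rest, obs2d, scale, t2d):
    """Scan plausible jaw-opening angles and return the one minimizing
    reprojection error of the rotated, projected rest-pose jaw points."""
    thetas = np.radians(np.linspace(0.0, 35.0, 351))
    errs = [np.linalg.norm(project(jaw_rest @ rot_x(t).T, scale, t2d) - obs2d)
            for t in thetas]
    return thetas[int(np.argmin(errs))]

# Synthetic check: generate observations at 18 degrees and recover them.
jaw_rest = np.array([[  0.0, -40.0, 20.0],   # chin tip, relative to axis
                     [ 15.0, -30.0, 25.0],   # jawline points (hypothetical)
                     [-15.0, -30.0, 25.0]])
t2d = np.array([120.0, 200.0])
obs = project(jaw_rest @ rot_x(np.radians(18.0)).T, scale=3.0, t2d=t2d)
print(np.degrees(recover_jaw_angle(jaw_rest, obs, 3.0, t2d)))  # ~18.0
```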