The quality of static phones (e.g. vowels, fricatives, nasals, laterals) generated by articulatory speech synthesizers has reached a high level in recent years. Our goal is to ex...
We consider a hierarchical two-layer model of natural signals in which both layers are learned from the data. Estimation is accomplished by Score Matching, a recently proposed est...
Virtual conversational agents are supposed to combine speech with nonverbal modalities for intelligible and believable utterances. However, the automatic synthesis of coverbal ge...
Previously we have proposed different models for estimating articulatory gestures and vocal tract variable (TV) trajectories from synthetic speech. We have shown that when deploye...
Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson,...