Dynamic Bayesian Networks (DBNs) have been widely studied in multi-modal speech recognition applications. Here, we introduce DBNs into an acoustically-driven talking face synthesi...
Jianxia Xue, Jonas Borgstrom, Jintao Jiang, Lynne ...
Led by the fundamental role that rhythms apparently play in speech and gestural communication among humans, this study was undertaken to substantiate a biologically motivated model...
When multimedia documents are presented to users sequentially, an information filtering (IF) system helps achieve good retrieval performance in terms of both quality and ...
Dianhui Wang, Xiaodi Huang, Yong-Soo Kim, Joon Shi...
This paper proposes an approach to the problem of generating metadata for composite mixed-media digital objects by appropriately combining and exploiting existing knowledge or met...
Editing speech data is currently time-consuming and error-prone. Speech editors rely on acoustic waveform representations, which force users to repeatedly sample the underlying spe...