Structured precision modelling is an important approach to improve the intra-frame correlation modelling of the standard HMM, where Gaussian mixture model with diagonal covariance...
Enriching a pronunciation dictionary with phonological variation is a challenging task, not yet solved despite several decades of research, in particular for speech-to-text transc...
This paper proposes a method to automatically extract highlight scenes from sports (baseball) live video in real time and to allow users to retrieve them. For this purpose, sophis...
We introduce a direct model for speech recognition that assumes an unstructured, i.e., flat text output. The flat model allows us to model arbitrary attributes and dependences o...
Georg Heigold, Geoffrey Zweig, Xiao Li, Patrick Ng...
This paper presents a new unit selection process for Very Low Bit Rate speech encoding around 500 bits/sec. The encoding is based on speech recognition and speech synthesis technol...