In this paper a flexible, high-throughput, low-complexity additive white Gaussian noise (AWGN) channel generator is presented. The proposed generator employs a Mersenne-Twister to g...
Speech translation (ST) is an enabling technology for cross-lingual oral communication. An ST system consists of two major components: an automatic speech recognizer (ASR) and a ma...
The 2010 NIST Speaker Recognition Evaluation (SRE10) included a test of Human Assisted Speaker Recognition (HASR) in which systems based in whole or in part on human expertise wer...
Craig S. Greenberg, Alvin F. Martin, George R. Dod...
A hybrid model was developed to predict the zero-quantized discrete cosine transform (ZQDCT) coefficients for intra blocks in our previous work. However, the complicated...
Jin Li, Weiwei Chen, Moncef Gabbouj, Jarmo Takala,...
In this paper, we study the capacity-achieving input covariance matrices for the jointly-correlated (or the Weichselberger) Rician fading multiple-input multiple-output (MIMO) a...
Chao-Kai Wen, Shi Jin, Kai-Kit Wong, Jung-Chieh Ch...
Functional magnetic resonance imaging (fMRI) is a popular tool for studying brain activity due to its non-invasiveness. Conventionally an expected response needs to be available f...
Sarah Lee, Fernando Zelaya, Yohan Samarasinghe, St...
In this work, we compare several known approaches for multilingual acoustic modeling for three languages, Dari, Farsi and Pashto, which are of recent geo-political interest. We de...
Arindam Mandal, Dimitra Vergyri, Murat Akbacak, Co...
One problem in concatenative speech synthesis is how to incorporate prosodic factors in the unit selection. Imposing a predicted prosodic target is error-prone and does not benefi...
We present a speech pre-processing scheme (SPPS) for robust speech recognition in the moving motorcycle environment. The SPPS is dynamically adapted during the run-time operation ...
We consider the task of under-determined reverberant audio source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a ze...