A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around 50% reduct...
We present a pronunciation error detection method for second language learners of English (L2 learners). The method is a combination of confidence scoring and landmark-based Suppo...
Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproa...
We propose a novel approach to modeling prosodic features. Inspired by the Joint Factor Analysis (JFA) model, our model is based on the same idea of introducing subspace of model para...
Marcel Kockmann, Lukas Burget, Ondrej Glembek, Luc...
Decision tree-based context clustering is an essential but time-consuming part of building HMM-based speech synthesis systems. The widely used implementation of this technique is ...
Computer-Assisted Pronunciation Training (CAPT) systems have become an important learning aid in second language (L2) learning. Our approach to CAPT is based on the use of phonologi...
The search for the optimal word sequence can be performed efficiently even in a speech recognizer with a very large vocabulary and complex models. This is achieved using pruning m...
Techniques for recording the vocal tract shape during speech such as X-ray microbeam or EMA track the spatial location of pellets attached to several articulators. Limitations of ...
This paper presents an analysis of phoneme durations of emotional speech in two languages: Dutch and Korean. The analyzed corpus of emotional speech has been specifically develope...
We carry out two studies on affective state modeling for communication settings that involve unilateral intent on the part of one participant (the evoker) to shift the affective s...
In this work, the RWTH automatic speech recognition systems for English and German for the second Quaero evaluation campaign 2009 are presented. The systems are designed to transc...