Sciweavers

INTERSPEECH
2010
13 years 6 months ago
Recurrent neural network based language model
A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around 50% reduct...
Tomas Mikolov, Martin Karafiát, Lukas Burge...
INTERSPEECH
2010
13 years 6 months ago
Landmark-based automated pronunciation error detection
We present a pronunciation error detection method for second language learners of English (L2 learners). The method is a combination of confidence scoring and landmark-based Suppo...
Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproa...
INTERSPEECH
2010
13 years 6 months ago
Prosodic speaker verification using subspace multinomial models with intersession compensation
We propose a novel approach to modeling prosodic features. Inspired by the Joint Factor Analysis (JFA) model, our model is based on the same idea of introducing a subspace of model para...
Marcel Kockmann, Lukas Burget, Ondrej Glembek, Luc...
INTERSPEECH
2010
13 years 6 months ago
An implementation of decision tree-based context clustering on graphics processing units
Decision tree-based context clustering is an essential but time-consuming part of building HMM-based speech synthesis systems. The widely used implementation of this technique is ...
Nicholas Pilkington, Heiga Zen
INTERSPEECH
2010
13 years 6 months ago
Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system
Computer-Assisted Pronunciation Training (CAPT) systems have become an important learning aid in second language (L2) learning. Our approach to CAPT is based on the use of phonologi...
Wai Kit Lo, Shuang Zhang, Helen M. Meng
INTERSPEECH
2010
13 years 6 months ago
Direct observation of pruning errors (DOPE): a search analysis tool
The search for the optimal word sequence can be performed efficiently even in a speech recognizer with a very large vocabulary and complex models. This is achieved using pruning m...
Volker Steinbiss, Martin Sundermeyer, Hermann Ney
INTERSPEECH
2010
13 years 6 months ago
Estimating missing data sequences in X-ray microbeam recordings
Techniques for recording the vocal tract shape during speech, such as X-ray microbeam or EMA, track the spatial location of pellets attached to several articulators. Limitations of ...
Chao Qin, Miguel Á. Carreira-Perpiñ...
INTERSPEECH
2010
13 years 6 months ago
Language specific effects of emotion on phoneme duration
This paper presents an analysis of phoneme durations of emotional speech in two languages: Dutch and Korean. The analyzed corpus of emotional speech has been specifically develope...
Martijn Goudbeek, Mirjam Broersma
INTERSPEECH
2010
13 years 6 months ago
Towards affective state modeling in narrative and conversational settings
We carry out two studies on affective state modeling for communication settings that involve unilateral intent on the part of one participant (the evoker) to shift the affective s...
Bart Jochems, Martha Larson, Roeland Ordelman, Ron...
INTERSPEECH
2010
13 years 6 months ago
The RWTH 2009 Quaero ASR evaluation system for English and German
In this work, the RWTH automatic speech recognition systems for English and German for the second Quaero evaluation campaign 2009 are presented. The systems are designed to transc...
Markus Nußbaum-Thom, Simon Wiesler, Martin S...