Sciweavers

INTERSPEECH
2010
Measuring basic tempo across languages and some implications for speech rhythm
Basic language-inherent tempo cannot be isolated by the current metrics of speech rhythm. Here we propose the number of syllables per intonation unit as an appropriate measure, al...
Gertraud Fenk-Oczlon, August Fenk
INTERSPEECH
2010
Glottal-based analysis of the Lombard effect
The Lombard effect refers to the speech changes due to the immersion of the speaker in a noisy environment. Among these changes, studies have already reported acoustic modificatio...
Thomas Drugman, Thierry Dutoit
INTERSPEECH
2010
An improved wavelet-based dereverberation for robust automatic speech recognition
This paper presents an improved wavelet-based dereverberation method for automatic speech recognition (ASR). Dereverberation is based on filtering reverberant wavelet coefficients...
Randy Gomez, Tatsuya Kawahara
INTERSPEECH
2010
Discriminative adaptation for log-linear acoustic models
Log-linear models have recently been used in acoustic modeling for speech recognition systems. This has been motivated by competitive results compared to systems based on Gaussian...
Jonas Lööf, Ralf Schlüter, Hermann ...
INTERSPEECH
2010
Pitch similarity in the vicinity of backchannels
Dynamic modeling of spoken dialogue seeks to capture how interlocutors change their speech over the course of a conversation. Much work has focused on how speakers adapt or entrai...
Mattias Heldner, Jens Edlund, Julia Hirschberg
INTERSPEECH
2010
Learning from human errors: prediction of phoneme confusions based on modified ASR training
In an attempt to improve models of human perception, the recognition of phonemes in nonsense utterances was predicted with automatic speech recognition (ASR) in order to analyze i...
Bernd T. Meyer, Birger Kollmeier
INTERSPEECH
2010
Expectations for discourse genre identification: a prosodic study
Speech can be divided into discourse genres based on the contextual environment it occurs in (e.g. political speech, sport commentary speech, etc.). The present study investigated...
Nicolas Obin, Volker Dellwo, Anne Lacheret, Xavier...
INTERSPEECH
2010
Improved neural network based language modelling and adaptation
Neural network language models (NNLM) have become an increasingly popular choice for large vocabulary continuous speech recognition (LVCSR) tasks, due to their inherent generalisa...
Junho Park, Xunying Liu, Mark J. F. Gales, Philip ...
INTERSPEECH
2010
Building transcribed speech corpora quickly and cheaply for many languages
We present a system for quickly and cheaply building transcribed speech corpora containing utterances from many speakers in a variety of acoustic conditions. The system consists o...
Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu...
INTERSPEECH
2010
Distribution and trichotomic realization of voiced velars in Japanese - an experimental study
In this paper, we demonstrate the trichotomic realization of voiced velars in Japanese, challenging the traditional plosive/nasal dichotomy of velar allophones, and examine the di...
Shin-ichiro Sano, Tomohiko Ooigawa