We manually designed rules for a backchannel (BC) prediction model based on pitch and pause information. In short, the model predicts a BC when there is a pause of a certain lengt...
Most paralinguistic analysis tasks are lacking agreed-upon evaluation procedures and comparability, in contrast to more `traditional' disciplines in speech analysis. The INTE...
This paper presents a novel automatic speaker age and gender identification approach which combines five different methods at the acoustic level to improve the baseline performanc...
This paper presents a comparison between a hidden Markov model (HMM) based method and a novel artificial neural network (ANN) based method for lip synchronisation. Both model type...
Text-to-speech synthesizer systems are of overall good quality, especially when adapted to a specific task. Given this task and an adapted voice corpus, the message quality is mai...
This paper discusses a set of modifications regarding the use of the Bayesian Information Criterion (BIC) for the speaker diarization task. We focus on the specific variant of the...
This paper describes the AuToBI tool for automatic generation of hypothesized ToBI labels. While research on automatic prosodic annotation has been conducted for many years, AuToB...
This paper presents our latest efforts toward LVCSR systems for five Eastern European languages such as Bulgarian, Croatian, Czech, Polish, and Russian using our Rapid Language Ad...
Ngoc Thang Vu, Tim Schlippe, Franziska Kraus, Tanj...
In this paper, we propose a joint optimal method for automatic speech recognition (ASR) and ideal binary mask (IBM) estimation in transformed into the cepstral domain through a ne...
Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson
This paper presents a study on the acoustic sub-segmental properties of word-final /t/ in conversational standard Dutch and how these properties contribute to whether humans and a...
Barbara Schuppler, Mirjam Ernestus, Wim A. van Dom...