In spoken dialogues, if a spoken dialogue system does not respond at all during the user's utterances, the user might feel uneasy because the user does not know whether or not th...
Abstract. We apply Long Short-Term Memory (LSTM) recurrent neural networks to a large corpus of unprompted speech, the German part of the VERBMOBIL corpus. Training first on a fra...
Nicole Beringer, Alex Graves, Florian Schiel, J&uu...
The output of a speech recognition system is not always ideal for subsequent downstream processing, in part because speakers themselves often make mistakes. A system would accompl...
Due to upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT, called WTIMIT, has been recorded. It allows a...
We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party con...
Alex Stupakov, Evan Hanusa, Jeff A. Bilmes, Dieter...