A conventional automatic speech recognizer does not perform well in the presence of multiple sound sources, while human listeners are able to segregate and recognize a signal of i...
Yang Shao, Soundararajan Srinivasan, Zhaozhang Jin...
When exposed to environmental noise, speakers adjust their speech production to maintain intelligible communication. This phenomenon, called Lombard effect (LE), is known to consi...
Data sparseness is an ever dominating problem in automatic emotion recognition. Using artificially generated speech for training or adapting models could potentially ease this: t...
A conditional model is introduced for triggering understanding actions that correct errors of frame hypothesization and composition. Experimental evidence is provided using the Fr...
Written documents created through dictation differ significantly from a true verbatim transcript of the recorded speech. This poses an obstacle in automatic dictation systems as s...
Maximilian Bisani, Paul Vozila, Olivier Divay, Jef...