High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or au...
Julien Fayolle, Fabienne Moreau, Christian Raymond...
Speech recognition has become common in many application domains, from dictation systems for professional practices to vocal user interfaces for people with disabilities or hands-...
Sabato Marco Siniscalchi, Fulvio Gennaro, Salvator...
In this paper, we propose a joint optimal method for automatic speech recognition (ASR) and ideal binary mask (IBM) estimation, transformed into the cepstral domain through a ne...
Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson
Computer Assisted Language Learning (CALL) applications for improving the oral skills of low-proficiency learners have to cope with nonnative speech that is particularly challengin...
Joost van Doremalen, Catia Cucchiarini, Helmer Str...
The use of visual information derived from accurate lip extraction can provide features invariant to noise perturbation for speech recognition systems and can also be used in a w...