Audio-Visual Speech Recognition (AVSR) uses vision to enhance speech recognition but also introduces the problem of how to join (or fuse) these two signals together. Mainstream re...
A self-organizing neural network for sequence classification called SARDNET is described and analyzed experimentally. SARDNET extends the Kohonen Feature Map architecture with act...
Abstract. Systems for keyword and non-linguistic vocalization detection in conversational agent applications need to be robust with respect to background noise and different speak...
We propose a novel multi-stream framework for continuous conversational speech recognition which employs bidirectional Long Short-Term Memory (BLSTM) networks for phoneme predicti...
This paper presents a new method for the verification of the correct pronunciation of spoken words. This process is based on speech recognition technology. It can be particularly ...