We investigate the connection between part of speech (POS) distribution and content in language. We define POS blocks to be groups of parts of speech. We hypothesise that there ex...
This paper considers a method for speech emotion recognition using a max-margin framework that incorporates a loss function based on a well-known model called the Watson and Tellegen’s...
We present an audio-visual person authentication system that extracts several novel "VisualizedSpeech-Features" (VSF) from the spoken-password and multiple face profile...
This paper presents a new approach to language model construction, learning a language model not from text, but directly from continuous speech. A phoneme lattice is created using...
Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsu...
This paper presents a visual speech synthesizer providing midsagittal and front views of the vocal tract to help language learners correct their mispronunciations. We adopt ...