The number of video clips available online is growing at a tremendous pace. Conventionally, user-supplied metadata text, such as the title of the video and a set of keywords, has ...
Mehmet Emre Sargin, Hrishikesh Aradhye, Pedro J. M...
This paper describes recent advances at LIMSI in Mandarin Chinese speech-to-text transcription. A number of novel approaches were introduced in the different system components. Th...
Lori Lamel, Jean-Luc Gauvain, Viet-Bac Le, Ilya Op...
The acceleration of acoustic likelihood calculation has been an important research issue for developing practical speech recognition systems. And there are various specification ...
This paper describes an animated conversational agent called Kare which integrates a talking head interface with a linguistically motivated human-machine dialogue system. The age...
We describe the MusicMiner system for organizing large collections of music with databionic mining techniques. Low level audio features are extracted from the raw audio data on sh...