The number of video clips available online is growing at a tremendous pace. Conventionally, user-supplied metadata text, such as the title of the video and a set of keywords, has ...
Mehmet Emre Sargin, Hrishikesh Aradhye, Pedro J. M...
The Degenerate Unmixing Estimation Technique (DUET) is a Blind Source Separation (BSS) algorithm for stereo audio. DUET depends on an amplitude-phase 2D histogram built from the d...
This paper describes recent advances at LIMSI in Mandarin Chinese speech-to-text transcription. A number of novel approaches were introduced in the different system components. Th...
Lori Lamel, Jean-Luc Gauvain, Viet-Bac Le, Ilya Op...
The “tandem approach” in speech recognition has been shown to increase performance by using classifier posterior probabilities as observations in a...
In emotion recognition, a widely used method to reconcile disagreement between multiple human evaluators is to perform majority voting on their assigned class labels. Instead, ...
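
For readers unfamiliar with the baseline mentioned in the last entry, majority voting over evaluator labels can be sketched as follows. This is only an illustration and is not taken from the paper: the function name `majority_vote`, the example labels, and the choice to return `None` on a tie are all assumptions, and the sketch picks the most frequent label (a plurality) rather than requiring more than half of the votes.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None if the input is empty or tied.

    Hypothetical helper: tie handling (returning None) is an assumption,
    not something specified by the abstract above.
    """
    ranked = Counter(labels).most_common()
    if not ranked:
        return None
    # A tie for first place means the evaluators did not agree on one label.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Three hypothetical evaluators labelling one utterance.
print(majority_vote(["angry", "angry", "sad"]))
```

Utterances for which the function returns `None` would need a separate policy, such as discarding them or asking another evaluator; which policy is appropriate depends on the corpus.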