We propose a multimodal speaker segmentation algorithm with two main contributions: First, we suggest a hidden Markov model architecture that performs fusion of the three modaliti...
Viktor Rozgic, Kyu Jeong Han, Panayiotis G. Georgi...
In spoken dialogue systems, it is important for a system to know how likely a speech recognition hypothesis is to be correct, so it can reprompt for fresh input, or, in cases wher...
Fuzzy rule base systems have been successfully used for pattern classification. These systems focus on generating a rule-base from numerical input data. The resulting rule-base ca...
This paper describes an audio retrieval system, Quebex, that works on raw audio data. The system is able to retrieve songs that are rhythmically and timbrewise similar from a databa...
This paper studies the effect of Latent Semantic Analysis (LSA) on two different tasks: multimedia document retrieval (MDR) and automatic image annotation (AIA). The contributio...
Trong-Ton Pham, Nicolas Maillot, Joo-Hwee Lim, Jea...