We propose a multimodal speaker segmentation algorithm with two main contributions: First, we suggest a hidden Markov model architecture that performs fusion of the three modaliti...
Viktor Rozgic, Kyu Jeong Han, Panayiotis G. Georgi...
Automatic grouping and segmentation of images remains a challenging problem in computer vision. Recently, a number of authors have demonstrated good performance on this task using...
Music information retrieval (MIR) holds great promise as a technology for managing large music archives. One of the key components of MIR that has been actively researched is...
Jialie Shen, Wang Meng, Shuichang Yan, HweeHwa Pan...
Many perception and multimedia indexing problems involve datasets that are naturally composed of multiple streams or modalities for which supervised training data is only sparsely...
Ashish Kapoor, Chris Mario Christoudias, Raquel Ur...
As computer and database technologies advance rapidly, biologists all over the world can share biologically meaningful data from images of specimens and use the data to classify th...