This paper presents an approach to multi-sensory and multi-modal fusion in which computer vision information obtained from calibrated cameras is integrated with a large-scale sent...
Human motion can be understood on many levels. The most basic level is the notion that humans are collections of things that have predictable visual appearance. Next is the notion...
Christopher Richard Wren, Brian P. Clarkson, Alex ...
In this paper, we investigate what can be inferred from several silhouette probability maps in multi-camera environments. To this end, we propose a new framework for multi-view s...
Many perception and multimedia indexing problems involve datasets that are naturally composed of multiple streams or modalities for which supervised training data is only sparsely...
Ashish Kapoor, Chris Mario Christoudias, Raquel Ur...
Phoneme segmentation is a fundamental problem in many speech recognition and synthesis studies. Unsupervised phoneme segmentation assumes no knowledge of linguistic content and a...