3D scene understanding is key for the success of applications such as autonomous driving and robot navigation. However, existing approaches either produce a mild level of understa...
—Inexpensive RGB-D cameras that give an RGB image together with depth data have become widely available. We use this data to build 3D point clouds of a full scene. In this paper,...
Hema Swetha Koppula, Abhishek Anand, Thorsten Joac...
In this paper, an approach for polyphonic music transcription based on joint multiple-F0 estimation and note onset/offset detection is proposed. For preprocessing, the resonator t...
The use of visual information derived from accurate lip extraction, can provide features invariant to noise perturbation for speech recognition systems and can be also used in a w...
We propose a novel approach to understanding
activities from their partial observations monitored through
multiple non-overlapping cameras separated by unknown time
gaps. In our...