We investigate the challenging issue of joint audio-visual analysis of generic videos targeting at semantic concept detection. We propose to extract a novel representation, the Sh...
Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan...
This paper presents an automatic Australian sign language (Auslan) recognition system, which tracks multiple target objects (the face and hands) throughout an image sequence and e...
We introduce the Longterm Observation of Scenes (with Tracks) dataset. This dataset comprises videos taken from streaming outdoor webcams, capturing the same half hour, each day, ...
Austin Abrams, Jim Tucek, Joshua Little, Nathan Ja...
Technology in the field of digital media generates huge amounts of non-textual information, audio, video, and images, along with more familiar textual information. The potential f...
We propose a new framework for multi-object segmentation of deep brain structures, which have significant shape variations and relatively small sizes in medical brain images. In th...