In this paper, we propose a novel semi-automatic annotation scheme for video semantic classification. It is well known that the large gap between high-level semantics and low...
Typical tag recommendation systems for photos shared on social networks such as Flickr use visual content analysis, collaborative filtering or personalization strategies to prod...
Neela Sawant, Ritendra Datta, Jia Li, James Ze Wan...
In this paper, we propose an approach to learning appearance models of moving objects directly from compressed video. The appearance of a moving object changes dynamically in vide...
We present MuSA.RT, Opus 1, a multimodal interactive system for music analysis and visualization using the Spiral Array model. Real-time MIDI input from a live performance is proc...
Acoustic event detection (AED) aims to identify both timestamps and types of multiple events and has been found to be very challenging. The cues for these events often exist...
Po-Sen Huang, Xiaodan Zhuang, Mark Hasegawa-Johnso...