One of the challenges the service providers currently face is the ability to introduce a variety of services at a minimal cost and impact to their customers. These services often ...
This paper presents a multi-modal approach to locate a speaker in a scene and determine to whom he or she is speaking. We present a simple probabilistic framework that combines mu...
Michael Siracusa, Louis-Philippe Morency, Kevin Wi...
Videos of multi-player team sports provide a challenging domain for dynamic scene analysis. Player actions and interactions are complex as they are driven by many factors, such as...
Kihwan Kim, Matthias Grundmann, Ariel Shamir, Iain...
While existing studies on YouTube’s massive user-generated video content have mostly focused on the analysis of videos, their characteristics, and network properties, little att...
The ability to detect and track human heads and faces in video sequences is useful in a great number of applications, such as human-computer interaction and gesture recognition. Re...