In this paper we address the problem of estimating who is speaking from automatically extracted low-resolution visual cues in group meetings. Traditionally, the task of speech/non...
In this paper we explore the relationship between the temporal and rhythmic structure of musical audio signals. Using automatically extracted rhythmic structure, we present a rhyth...
Norberto Degara, Antonio Pena, Matthew E. P. Davi...
Removing commercials from television programs is a much sought-after feature for a personal video recorder. In this paper, we employ an unsupervised clustering scheme (CM Detect) ...
King-Shy Goh, Koji Miyahara, Regunathan Radhakrish...
We propose a new approach for video event learning. The only hypothesis is the availability of tracked object attributes. The approach incrementally aggregates the attributes and r...
We present a method that automatically detects chewing events in surveillance video of a subject. Firstly, an Active Appearance Model (AAM) is used to track a subject’s face acr...