We present a method for video classification based on information in the soundtrack. Unlike previous approaches which describe the audio via statistics of mel-frequency cepstral ...
Courtenay V. Cotton, Daniel P. W. Ellis, Alexander...
Acoustic event detection (AED) aims to identify both timestamps and types of multiple events and has been found to be very challenging. The cues for these events often times exist...
Po-Sen Huang, Xiaodan Zhuang, Mark Hasegawa-Johnso...
This paper describes an audio-based video surveillance system which automatically detects anomalous audio events in a public square, such as screams or gunshots, and localizes the...
Giuseppe Valenzise, Luigi Gerosa, Marco Tagliasacc...
An increasing number of computer vision applications require on-line processing of data streams, preferably in real-time. This trend is fueled by the mainstream availability of low...
Speaker diarization is originally defined as the task of determining “who spoke when” given an audio track and no other prior knowledge of any kind. The following article sho...