Unsupervised content discovery in composite audio


Automatically extracting semantic content from audio streams can be helpful in many multimedia applications. Motivated by the known limitations of traditional supervised approaches to content extraction, which are hard to generalize and require suitable training data, we propose in this paper an unsupervised approach to discover and categorize semantic content in a composite audio stream. In our approach, we first employ spectral clustering to discover natural semantic sound clusters in the analyzed data stream (e.g. speech, music, noise, applause, speech mixed with music, etc.). These clusters are referred to as audio elements. Based on the obtained set of audio elements, the key audio elements, which are most prominent in characterizing the content of input audio data, are selected and used to detect potential boundaries of semantic audio segments denoted as auditory scenes. Finally, the auditory scenes are categorized in terms of the audio elements appearing therein. Categorization...
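As a rough illustration of the first step described in the abstract (grouping audio segments into clusters by spectral clustering), the following is a minimal sketch and not the authors' implementation. It assumes each audio segment has already been reduced to a fixed-length feature vector, uses a Gaussian affinity with a hypothetical width parameter `sigma`, and splits the data into only two clusters, whereas the paper discovers an arbitrary number of audio elements. The function name `spectral_bipartition` and the toy data are made up for this example.

```python
import numpy as np

def spectral_bipartition(features, sigma=1.0):
    """Split feature vectors into two clusters (illustrative two-way case only)."""
    X = np.asarray(features, dtype=float)
    # Gaussian affinity between every pair of segments (sigma is an assumed parameter).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue carries the two-way split.
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1] * d_inv_sqrt
    # The overall sign of an eigenvector is arbitrary, so label 0/1 may swap.
    return (fiedler > 0).astype(int)

# Hypothetical 2-D feature vectors for six segments forming two obvious groups.
toy_features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]]
toy_labels = spectral_bipartition(toy_features)
```

On the toy data, the first three segments receive one label and the last three the other; which group gets 0 and which gets 1 is not fixed. Handling more than two clusters would typically use several eigenvectors followed by a clustering step such as k-means, which is omitted here.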

Rui Cai, Lie Lu, Alan Hanjalic


Audio Element | Auditory Scene | Key Audio Element | MM 2005


Post Info

Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	MM
Authors	Rui Cai, Lie Lu, Alan Hanjalic
