Audio identification based on spectral modeling of bark-bands energy and synchronization through onset detection

In this paper, we present for the first time the IRCAM fingerprint system for audio identification in streams. The baseline system relies on a double-nested Short-Time Fourier Transform (STFT). The first STFT computes the energies of a filter bank, which are then modelled over 2 s using a second STFT. We then present recent improvements to our system: first, the inclusion of perceptual scales for amplitude and frequency (Bark bands); second, the synchronization of stream and database frames using an onset detection system. The performance of these improvements is tested on a large set of real audio streams. We compare our results with those of re-implementations of two state-of-the-art systems, Philips and Shazam.
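The double-nested STFT idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the frame and hop sizes, the number of bands, the number of retained coefficients, and the 0.5 s hop between 2 s windows are all assumptions chosen for illustration, and the filter bank here uses equal-width linear bands as a placeholder where the improved system described in the abstract would use Bark bands.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice an array into overlapping frames along its first axis."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def band_energies(x, frame_len=1024, hop=512, n_bands=8):
    """First STFT: energy per frame in each band of a simple filter bank.

    Assumption: equal-width linear bands stand in for a perceptual
    (e.g. Bark) band layout.
    """
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_bins)
    edges = np.linspace(0, power.shape[1], n_bands + 1).astype(int)
    bands = [power[:, a:b].sum(axis=1) for a, b in zip(edges[:-1], edges[1:])]
    return np.stack(bands, axis=1)  # (n_frames, n_bands)


def double_stft_fingerprint(x, sr, frame_len=1024, hop=512, n_bands=8,
                            window_s=2.0, window_hop_s=0.5, n_coeffs=6):
    """Second STFT: model each band's energy trajectory over 2 s windows."""
    energies = band_energies(x, frame_len, hop, n_bands)
    win = int(round(window_s * sr / hop))          # energy frames per window
    win_hop = int(round(window_hop_s * sr / hop))  # hypothetical window hop
    windows = frame_signal(energies, win, win_hop)  # (n_windows, win, n_bands)
    windows = windows * np.hanning(win)[None, :, None]
    spectrum = np.abs(np.fft.rfft(windows, axis=1))
    # Keep the lowest modulation-frequency coefficients of each band.
    return spectrum[:, :n_coeffs, :].transpose(0, 2, 1)


# Usage: 5 s of noise at 16 kHz gives one (n_bands, n_coeffs) code per window.
rng = np.random.default_rng(0)
signal = rng.standard_normal(5 * 16000)
codes = double_stft_fingerprint(signal, sr=16000)
print(codes.shape)
```

Each row of the result summarizes how the band energies evolve over one 2 s window, which is the quantity the abstract describes as being modelled by the second STFT. How such codes are quantized, indexed, and matched against a database is not covered by this sketch.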

Mathieu Ramona, Geoffroy Peeters

ICASSP 2011 | Signal Processing | Time Fourier Transform | first STFT | first Time |

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Conference
Year	2011
Where	ICASSP
Authors	Mathieu Ramona, Geoffroy Peeters
