Audio Signal Representations for Indexing in the Transform Domain



Indexing audio signals directly in the transform domain can potentially save a significant amount of computation when working on a large database of signals stored in a lossy compression format, without having to fully decode the signals. Here, we show that the representations used in standard transform-based audio codecs (e.g. MDCT for AAC, or hybrid PQF/MDCT for MP3) have a sufficient time resolution for some rhythmic features, but a poor frequency resolution, which prevents their use in tonality-related applications. Alternatively, a recently developed audio codec based on a sparse multiscale MDCT transform has a good resolution both for time- and frequency-domain features. We show that this new audio codec allows efficient transform-domain audio indexing for 3 different applications, namely beat tracking, chord recognition and musical genre classification. We compare results obtained with this new audio codec and the two standard MP3 and AAC codecs, in terms of performance and comput...
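The time/frequency trade-off mentioned in the abstract can be illustrated with a minimal sketch of a single-frame MDCT. This is not the authors' codec or any standard decoder: the function names `mdct` and `resolution`, the sine window, and the example values (a 44.1 kHz sample rate and frames of 1024 or 128 coefficients) are assumptions chosen only for illustration. A frame of 2N samples yields N coefficients spaced fs/(2N) Hz apart, and successive frames advance by N samples, so a longer frame narrows the frequency bins but coarsens the time grid.

```python
import numpy as np


def mdct(frame):
    """Direct O(N^2) MDCT of one frame of 2N samples; returns N coefficients.

    A sine window is applied before the transform (an illustrative choice).
    """
    two_n = len(frame)
    n_coef = two_n // 2
    n = np.arange(two_n)
    k = np.arange(n_coef)
    window = np.sin(np.pi * (n + 0.5) / two_n)
    # MDCT basis: cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))
    basis = np.cos(np.pi / n_coef * np.outer(k + 0.5, n + 0.5 + n_coef / 2))
    return basis @ (window * frame)


def resolution(n_coef, sample_rate):
    """Return (hop in seconds, bin spacing in Hz) for frames of n_coef coefficients."""
    hop_seconds = n_coef / sample_rate
    bin_hz = sample_rate / (2 * n_coef)
    return hop_seconds, bin_hz


if __name__ == "__main__":
    fs = 44100  # assumed sample rate
    for n_coef in (1024, 128):  # assumed long and short frame sizes
        hop, bin_hz = resolution(n_coef, fs)
        print(f"N={n_coef}: hop={hop * 1000:.1f} ms, bin spacing={bin_hz:.1f} Hz")
```

Under these assumptions, the long frame gives bins about 21.5 Hz wide with a hop of about 23 ms, while the short frame gives bins about 172 Hz wide with a hop of about 3 ms. Features read directly from the coefficients inherit whichever grid the codec chose for that frame.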

Emmanuel Ravelli, Gaël Richard, Laurent Daudet


Audio Codec | Lossy Compression Format | Software Engineering | TASLP 2010 | Transform-based Audio Codecs |


Post Info

Added	21 May 2011
Updated	21 May 2011
Type	Journal
Year	2010
Where	TASLP
Authors	Emmanuel Ravelli, Gaël Richard, Laurent Daudet
