A scale-free distribution of false positives for a large class of audio similarity measures


Download www.jj-aucouturier.info

The "bag-of-frames" (BOF) approach to audio pattern recognition models signals as the long-term statistical distribution of their local spectral features; a prototypical implementation uses Gaussian Mixture Models of Mel-Frequency Cepstrum Coefficients. This approach is the predominant paradigm for extracting high-level descriptions from music signals, such as their instrument, genre or mood, and can also be used to compute direct timbre similarity between songs. However, a recent study by the authors shows that this class of algorithms, when applied to music, tends to create false positives that are almost always the same songs regardless of the query. In other words, with such models, there exist songs, which we call hubs, that are irrelevantly close to very many songs. This paper reports on a number of experiments, using implementations on large music databases, aimed at better understanding the nature and causes of such hub songs. We introduce two measures of...
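The notion of a hub can be made concrete with a small sketch. Assuming a precomputed square matrix of pairwise song distances, one could count how often each song appears among the k nearest neighbors of the other songs. The names `k_occurrences` and `toy_dist` are illustrative and do not come from the paper, and this generic count is not necessarily one of the measures the paper introduces.

```python
def k_occurrences(dist, k):
    """Count how often each item appears in other items' k-nearest-neighbor lists.

    dist: square list of lists of pairwise distances (hypothetical input).
    k: number of nearest neighbors retrieved per query.
    """
    n = len(dist)
    counts = [0] * n
    for query in range(n):
        # Rank every other item by distance to the query; ties broken by index.
        others = sorted(
            (j for j in range(n) if j != query),
            key=lambda j: (dist[query][j], j),
        )
        for j in others[:k]:
            counts[j] += 1
    return counts


# Toy data (made up for illustration): item 0 is close to every other item.
toy_dist = [
    [0, 1, 1, 1],
    [1, 0, 5, 5],
    [1, 5, 0, 5],
    [1, 5, 5, 0],
]

print(k_occurrences(toy_dist, k=1))
```

Each query contributes exactly k neighbors, so the counts always average to k. An item whose count is far above k, like item 0 in the toy matrix, behaves like a hub in the sense described above.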

Jean-Julien Aucouturier, François Pachet


Music | Pattern Recognition | Pattern Recognition Models | PR 2008 |


Post Info

Added	14 Dec 2010
Updated	14 Dec 2010
Type	Journal
Year	2008
Where	PR
Authors	Jean-Julien Aucouturier, François Pachet

