Cover song detection has become an active research topic as large numbers of personal music recordings and performances are released on the Internet. A good cover song recognizer groups and detects cover songs, which improves the search experience. The traditional approach matches two musical audio sequences by exhaustive pairwise comparison. In contrast to existing work, we aim to generate a group of concatenated feature sets based on regression modeling and to organize them with indexing-based approximate techniques, avoiding complicated audio sequence comparisons. We focus mainly on using Exact Locality Sensitive Mapping (ELSM) to join the concatenated feature sets and soft hash values. Similarity invariance among audio sequence comparisons is applied to define an optimal combination of several audio features. Soft hash values are pre-calculated to help locate the search range more accurately. Furthermore, we apply our algorithms to the analysis of real audio cover songs and...
Yi Yu, J. Stephen Downie, Fabian Mörchen, Lei