Music information retrieval is becoming increasingly important as music content grows in digital libraries, peer-to-peer systems, and on the Internet. While it is easy to quantize music into a discrete string representation, retrieval by content requires (approximate) substring matching, which is hard. In this paper, we present a novel system, called MUSIG, that uses compact MUsic SIGnatures for efficient content-based music retrieval. The signature is computed as follows: (a) each music file is split into a set of (overlapping) segments; (b) similar segments are clustered together, and the number of clusters determines the number of dimensions of the signature; (c) for each music file, the number of its segments that fall into a cluster determines the key value in that dimension. Most index structures for multimedia are only able to provide an initial filtering and return a set of candidate answers that must be further examined. For MUSIG, we have also designed a scoring function that p...
Bin Cui, H. V. Jagadish, Beng Chin Ooi, Kian-Lee T
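
The three signature steps (a) to (c) described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the excerpt does not say how segments are compared or clustered, so the sketch assumes fixed-length segments over a quantized note string, Hamming distance as the similarity measure, and a simple greedy "leader" clustering as a stand-in. All function names (`split_segments`, `hamming`, `build_clusters`, `signature`) and the parameters `length`, `step`, and `max_dist` are hypothetical.

```python
from collections import Counter


def split_segments(notes, length=4, step=2):
    """Step (a): split a quantized note string into overlapping segments.

    `length` and `step` are illustrative values, not taken from the paper.
    """
    return [notes[i:i + length] for i in range(0, len(notes) - length + 1, step)]


def hamming(a, b):
    """Number of positions at which two equal-length segments differ."""
    return sum(x != y for x, y in zip(a, b))


def build_clusters(segments, max_dist=1):
    """Step (b): greedy leader clustering (an assumed stand-in method).

    A segment becomes a new cluster leader unless an existing leader
    lies within `max_dist` of it. One leader corresponds to one dimension.
    """
    leaders = []
    for seg in segments:
        if not any(hamming(seg, leader) <= max_dist for leader in leaders):
            leaders.append(seg)
    return leaders


def signature(notes, leaders, length=4, step=2):
    """Step (c): count how many segments of a file fall into each cluster."""
    counts = Counter(
        min(range(len(leaders)), key=lambda i: hamming(seg, leaders[i]))
        for seg in split_segments(notes, length, step)
    )
    return [counts.get(i, 0) for i in range(len(leaders))]


# Toy library of two "music files", already quantized to note strings.
library = {"s1": "CDEFCDEF", "s2": "GABCGABC"}

# Cluster the segments of all files together to fix the dimensions.
all_segments = [seg for notes in library.values() for seg in split_segments(notes)]
leaders = build_clusters(all_segments)

# One count vector per file; its length equals the number of clusters.
signatures = {name: signature(notes, leaders) for name, notes in library.items()}
print(leaders)      # ['CDEF', 'EFCD', 'GABC', 'BCGA']
print(signatures)   # {'s1': [2, 1, 0, 0], 's2': [0, 0, 2, 1]}
```

In this toy case each file maps to a four-dimensional vector, so two files can be compared by their vectors instead of by substring matching on the full note strings. How MUSIG indexes and scores such vectors is not covered by this sketch.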