This paper considers a method for learning a distance metric in a fingerprinting system, which identifies query content by measuring the distance between the fingerprint of the query and the fingerprints stored in a database. A metric with the general form of the Mahalanobis distance is learned with the goal that the distance between fingerprints extracted from perceptually similar contents should be smaller than the distance between fingerprints extracted from perceptually dissimilar contents. The metric is learned by minimizing a cost function designed to achieve this goal. The cost function is convex, so its global minimum can be obtained by convex optimization. In our experiment, distance metric learning is applied to an audio fingerprinting system, and the learned distance metric is shown to improve identification performance.
Dalwon Jang, Chang D. Yoo
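
As an illustration of the metric form named in the abstract, the following minimal sketch evaluates a Mahalanobis-form distance (x - y)^T M (x - y) for a symmetric positive semidefinite matrix M. It is not the paper's learning procedure: the function names, the toy fingerprint vectors, and the hand-picked matrix `M_learned` are all assumptions made for this example, and no cost function is minimized here.

```python
import numpy as np


def mahalanobis_form_distance(x, y, M):
    """Return the squared distance (x - y)^T M (x - y)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ M @ d)


def is_psd(M, tol=1e-10):
    """Check that M is symmetric with no eigenvalue below -tol."""
    M = np.asarray(M, dtype=float)
    return bool(np.allclose(M, M.T) and np.linalg.eigvalsh(M).min() >= -tol)


# Hypothetical toy fingerprints, not data from the paper.
query = np.array([1.0, 0.0, 2.0])
similar = np.array([1.0, 3.0, 2.0])     # differs from the query only in dimension 1
dissimilar = np.array([2.0, 0.0, 3.0])  # differs from the query in dimensions 0 and 2

# With M = I the form reduces to the squared Euclidean distance.
M_identity = np.eye(3)

# Hand-picked stand-in for a learned metric: it down-weights dimension 1,
# assumed here to be a dimension that distortions perturb heavily.
M_learned = np.diag([1.0, 0.1, 1.0])

for name, M in [("identity", M_identity), ("hand-picked", M_learned)]:
    d_sim = mahalanobis_form_distance(query, similar, M)
    d_dis = mahalanobis_form_distance(query, dissimilar, M)
    print(f"{name}: similar={d_sim:.3f} dissimilar={d_dis:.3f}")
```

In this toy setting the identity matrix places the dissimilar fingerprint closer to the query than the similar one, while the re-weighted matrix reverses the order, which is the kind of ordering the stated learning goal asks for.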