Robust data retrieval in the presence of uncertainty is a challenging problem in multimedia information retrieval. In query-by-humming (QBH) systems, uncertainty can arise in query formulation due to user-dependent variability, such as incorrectly hummed notes, and in query transcription due to machine-based errors, such as insertions and deletions. We propose a fingerprinting (FP) algorithm for representing salient melodic information to better compare potentially noisy voice queries with target melodies in a database. The FP technique is employed in the QBH system back end; a hidden Markov model (HMM) front end segments and transcribes the hummed audio input into a symbolic representation. The performance of the FP search algorithm is compared with that of the conventional edit distance (ED) technique. Our retrieval database is built from 1500 MIDI files, and the system is evaluated using 400 hummed samples from 80 people with different musical backgrounds. A melody retrieval accuracy of 88% is demonstra...
Erdem Unal, Elaine Chew, Panayiotis G. Georgiou, S