This paper proposes a content-based audio information retrieval and indexing technique based on the wavelet transform. The presented approach uses the multiresolution decomposition property of the discrete wavelet transform to analyze audio data. The wavelet decomposition of an audio signal closely resembles its decomposition into sound octaves. A hierarchical indexing scheme is constructed from statistical properties of the wavelet coefficients, such as zero-crossing rate, mean, and standard deviation, at multiple scales. The performance of the proposed system is experimentally evaluated using 418 different audio clips. The prototype system yields recall ratios higher than 70% for sample queries with diverse audio characteristics.
Guohui Li, Ashfaq A. Khokhar
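
The feature extraction summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet family, the number of decomposition levels, or the exact definition of the zero-crossing rate, so the Haar wavelet, the default of four levels, and the function names `haar_dwt_step`, `zero_crossing_rate`, and `subband_features` are all assumptions made for illustration. The sketch computes the three statistics named above for each subband of a dyadic (octave-like) decomposition.

```python
import numpy as np


def haar_dwt_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    Assumption: Haar is used only because it is the simplest wavelet;
    the abstract does not say which wavelet the authors chose.
    """
    x = np.asarray(x, dtype=float)
    if len(x) % 2:  # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass half band
    detail = (even - odd) / np.sqrt(2.0)  # high-pass half band
    return approx, detail


def zero_crossing_rate(c):
    """Fraction of adjacent coefficient pairs whose signs differ."""
    if len(c) < 2:
        return 0.0
    signs = np.signbit(c)
    return float(np.mean(signs[1:] != signs[:-1]))


def subband_features(signal, levels=4):
    """Return (zcr, mean, std) for each subband.

    The list holds the detail subbands from finest to coarsest,
    followed by the final approximation subband.
    """
    features = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        # Each step halves the bandwidth, which mirrors an octave split.
        approx, detail = haar_dwt_step(approx)
        features.append(
            (zero_crossing_rate(detail), float(np.mean(detail)), float(np.std(detail)))
        )
    features.append(
        (zero_crossing_rate(approx), float(np.mean(approx)), float(np.std(approx)))
    )
    return features
```

Under these assumptions, a hierarchical index could compare the coarsest subband features first and consult finer subbands only for candidates that survive, although the abstract does not state how the hierarchy is actually organized.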