In this paper, we investigate the noise robustness of Wang and Shamma's early auditory (EA) model for the calculation of an auditory spectrum in audio classification applications. First, a stochastic analysis is conducted wherein an approximate expression of the auditory spectrum is derived to justify the noise-suppression property of the EA model. Second, we present an efficient fast Fourier transform (FFT)-based implementation for the calculation of a noise-robust auditory spectrum, which allows flexibility in the extraction of audio features. To evaluate the performance of the proposed FFT-based auditory spectrum, a set of speech/music/noise classification tasks is carried out wherein a support vector machine (SVM) algorithm and a decision tree learning algorithm (C4.5) are used as the classifiers. Features used for classification include conventional Mel-frequency cepstral coefficients (MFCCs), MFCC-like features obtained from the original auditory spectrum (i.e., based on th...
Wei Chu, B. Champagne