Some recent dereverberation approaches that have been effective for automatic speech recognition (ASR) model reverberation as a linear convolution in the spectral domain and derive a factorization that decomposes the spectra of reverberated speech into those of clean speech and a room-response filter. Typically, a general non-negative matrix factorization (NMF) framework is employed for this. In this work, we present an alternative to NMF: an iterative least-squares deconvolution technique for spectral factorization. We propose an efficient algorithm for this technique and experimentally demonstrate its effectiveness in improving ASR performance. The new method yields a 40-50% relative reduction in word error rate over standard baselines on artificially reverberated speech.
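To make the idea of spectral-domain deconvolution concrete, the following is a minimal sketch, not the algorithm proposed in this work. It assumes that, within a single frequency channel, the magnitude trajectory of reverberated speech is the convolution over time of a clean-speech trajectory with a short non-negative filter, and it alternates between two ordinary least-squares solves, clipping each estimate to stay positive. The names `conv_matrix` and `als_deconvolve`, the filter length, the iteration count, and the unit-sum normalization of the filter are all illustrative assumptions.

```python
import numpy as np

def conv_matrix(v, n_cols):
    """Return C such that C @ u == np.convolve(v, u) for any u of length n_cols."""
    C = np.zeros((len(v) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(v), j] = v
    return C

def als_deconvolve(y, filt_len, n_iter=50, eps=1e-12):
    """Alternating least-squares split of y into a signal x and a filter h.

    y is one frequency channel of a (hypothetical) reverberated magnitude
    spectrogram, assumed to be the full convolution of x and h.
    """
    sig_len = len(y) - filt_len + 1
    # Assumed initial guess: an exponentially decaying, unit-sum filter.
    h = np.exp(-np.arange(filt_len, dtype=float))
    h /= h.sum()
    for _ in range(n_iter):
        # Fix h, solve for x in the least-squares sense, then clip to positive.
        x, *_ = np.linalg.lstsq(conv_matrix(h, sig_len), y, rcond=None)
        x = np.maximum(x, eps)
        # Fix x, solve for h in the least-squares sense, then clip to positive.
        h, *_ = np.linalg.lstsq(conv_matrix(x, filt_len), y, rcond=None)
        h = np.maximum(h, eps)
        # Remove the scale ambiguity: h sums to one, x absorbs the gain.
        scale = h.sum()
        h /= scale
        x *= scale
    return x, h

# Usage on synthetic data (values chosen only for illustration).
rng = np.random.default_rng(0)
h_true = np.array([0.6, 0.3, 0.1])
x_true = rng.random(40) + 0.1
y = np.convolve(h_true, x_true)
x_est, h_est = als_deconvolve(y, filt_len=3, n_iter=20)
rel_err = np.linalg.norm(np.convolve(h_est, x_est) - y) / np.linalg.norm(y)
print("relative reconstruction error:", rel_err)
```

The factorization is not unique (any gain can be moved between `x` and `h`), which is why the sketch fixes the filter to unit sum. A full system would apply such a step per frequency channel and would need a treatment of convergence and efficiency that this sketch does not attempt.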
Kshitiz Kumar, Bhiksha Raj, Rita Singh, Richard M.