In this work, we show how expectation-maximization-based simultaneous channel and noise estimation can be derived without a vector Taylor series expansion. The central idea is to approximate the distribution of all the random variables involved (noisy speech, clean speech, channel, and noise) as one large joint Gaussian distribution. Instantaneous estimates of the noise and channel distribution parameters can then be obtained by conditioning the joint distribution on the observed noisy speech spectra. This approach allows expectation-maximization-based channel and noise estimation to be combined with the unscented transform.
Friedrich Faubel, John W. McDonough, Dietrich Klak
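
The conditioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: a single log-spectral bin, the commonly used interaction function y = x + h + log(1 + exp(n - x - h)) for clean speech x, channel h, and noise n, and a basic unscented transform with scaling parameter `kappa`. All function names and the prior values are hypothetical. The sketch builds the joint Gaussian over (x, h, n, y) from sigma points and then conditions it on one observed value of y; the surrounding expectation maximization iterations are omitted.

```python
import numpy as np

def noisy_speech(x, h, n):
    # Assumed log-spectral interaction function (not given in the abstract).
    return x + h + np.log1p(np.exp(n - x - h))

def sigma_points(mean, cov, kappa=0.0):
    # Basic unscented transform: 2d + 1 points with symmetric offsets.
    d = mean.size
    root = np.linalg.cholesky((d + kappa) * cov)  # lower-triangular factor
    pts = np.vstack([mean, mean + root.T, mean - root.T])
    w = np.full(2 * d + 1, 0.5 / (d + kappa))
    w[0] = kappa / (d + kappa)
    return pts, w

def joint_gaussian(mean_z, cov_z):
    # Approximate (x, h, n, y) as one joint Gaussian via the sigma points.
    pts, w = sigma_points(mean_z, cov_z)
    y = noisy_speech(pts[:, 0], pts[:, 1], pts[:, 2])
    aug = np.column_stack([pts, y])
    mean = w @ aug
    diff = aug - mean
    cov = (w[:, None] * diff).T @ diff
    return mean, cov

def condition_on_y(mean, cov, y_obs):
    # Standard Gaussian conditioning of (x, h, n) on the scalar observation y.
    gain = cov[:3, 3] / cov[3, 3]
    post_mean = mean[:3] + gain * (y_obs - mean[3])
    post_cov = cov[:3, :3] - np.outer(gain, cov[3, :3])
    return post_mean, post_cov

# Hypothetical prior over (clean speech, channel, noise) in one log-spectral bin.
prior_mean = np.array([10.0, 0.0, 5.0])
prior_cov = np.diag([4.0, 0.25, 1.0])

joint_mean, joint_cov = joint_gaussian(prior_mean, prior_cov)
post_mean, post_cov = condition_on_y(joint_mean, joint_cov, y_obs=11.0)
print("posterior mean (x, h, n):", post_mean)
```

The second and third entries of `post_mean` play the role of instantaneous channel and noise estimates for one observation. In a full system they would presumably be accumulated over frames to update the channel and noise distribution parameters, but that step is outside this sketch.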