
ICASSP
2008
IEEE

Fast speaker adaptation using non-negative matrix factorization

This paper describes a new method for fast speaker adaptation in large-vocabulary recognition systems. As in most HMM-based recognizers, the observation densities are modeled as weighted sums of Gaussian densities. Instead of adapting the means of the Gaussian densities, as is typically done, the method adapts the weights of the Gaussian densities in the states. Applying non-negative matrix factorization (NMF) makes the adaptation very fast. Experiments on the Wall Street Journal benchmark recognition task show relative improvements between 5% and 15%, while the adaptation converges within 0.2 seconds. Analysis of the latent speakers found by NMF shows that they reflect the speaker’s gender most prominently, even when vocal tract length normalization is used, and that they reflect the speaker’s age more clearly than the speaker’s regional influences or dialect.
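The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the following are all assumptions made for illustration: the columns of a matrix V hold the stacked Gaussian weights of the training speakers, V is factorized as V ≈ WH with the standard Lee-Seung multiplicative updates for the KL divergence, the columns of W play the role of the "latent speakers", and a new speaker is adapted by estimating only a small activation vector h while W stays fixed. The function names, the toy dimensions, and the per-state renormalization step are hypothetical and not taken from the paper.

```python
import numpy as np

def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize non-negative V (n x m) as W (n x rank) @ H (rank x m)
    using Lee-Seung multiplicative updates for the KL divergence."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def adapt(W, v, n_iter=50, eps=1e-12):
    """Estimate activations h for one new speaker with W held fixed.
    v is a non-negative vector of that speaker's weight statistics
    (assumed here to be something like Gaussian occupancy counts)."""
    h = np.ones(W.shape[1])
    for _ in range(n_iter):
        wh = W @ h + eps
        h *= (W.T @ (v / wh)) / (W.sum(axis=0) + eps)
    return h

def normalize_per_state(w, state_of_gaussian, n_states, eps=1e-12):
    """Rescale weights so that they sum to one within each HMM state."""
    totals = np.bincount(state_of_gaussian, weights=w, minlength=n_states)
    return w / np.maximum(totals, eps)[state_of_gaussian]

if __name__ == "__main__":
    # Toy setup (hypothetical sizes): 3 states with 4 Gaussians each,
    # 6 training speakers, 2 latent speakers.
    rng = np.random.default_rng(1)
    state_of_gaussian = np.repeat(np.arange(3), 4)
    V = rng.random((12, 6))
    W, H = nmf_kl(V, rank=2)
    h = adapt(W, rng.random(12))
    adapted = normalize_per_state(W @ h, state_of_gaussian, 3)
    print(adapted.reshape(3, 4))
```

In this sketch, adaptation only has to estimate as many parameters as there are latent speakers, which is one plausible reason such a scheme could converge quickly on little data; the timing and accuracy figures quoted in the abstract refer to the authors' system, not to this toy code.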
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICASSP
Authors Jacques Duchateau, Tobias Leroy, Kris Demuynck, Hugo Van Hamme