Fast speaker adaptation using non-negative matrix factorization



This paper describes a new method for fast speaker adaptation in large vocabulary recognition systems. As in most HMM-based recognizers, the observation densities are modeled as weighted sums of Gaussian densities. Instead of adapting the means of the Gaussian densities, as is typically done, the method adapts the weights of the Gaussian densities in the states. Applying non-negative matrix factorization (NMF) makes the adaptation very fast. Experiments on the Wall Street Journal benchmark recognition task show relative improvements between 5% and 15%, with the adaptation converging within 0.2 seconds. Analysis of the latent speakers found by NMF shows that they reflect the gender of the speaker most prominently, even when vocal tract length normalization is used, and that they reflect the speaker's age more clearly than the speaker's regional influences or dialect.
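
The abstract does not spell out the factorization, so the following is only a rough sketch of how weight adaptation with NMF could look, not the authors' implementation. It assumes that the Gaussian weights of all training speakers are stacked into a non-negative matrix `V` (one column per speaker), that `V` is factored as `W @ H` with the standard multiplicative updates for the Kullback-Leibler divergence, and that a new speaker is adapted by estimating only a short coefficient vector `h` while the latent-speaker basis `W` stays fixed. All names (`nmf_kl`, `update_coeffs`, `adapt_weights`, `state_of`) and the toy data are hypothetical.

```python
import numpy as np


def update_coeffs(V, W, H, eps=1e-12):
    """One multiplicative KL update of the coefficients H with the basis W held fixed."""
    WH = W @ H + eps
    return H * (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)


def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative V (n_weights x n_speakers) as W @ H.

    Columns of W play the role of 'latent speakers' (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H = update_coeffs(V, W, H, eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1) + eps)
    return W, H


def adapt_weights(v, W, state_of, n_iter=50, eps=1e-12):
    """Estimate latent-speaker coefficients for one new speaker and rebuild the weights.

    v        : non-negative weight statistics of the new speaker, shape (n_weights,)
    state_of : integer state index of each Gaussian weight, shape (n_weights,)
    Only rank-many coefficients are estimated, which is why this step is cheap.
    """
    h = np.full((W.shape[1], 1), 1.0 / W.shape[1])
    V = v[:, None]
    for _ in range(n_iter):
        h = update_coeffs(V, W, h, eps)
    w = (W @ h).ravel()
    # Renormalize so the weights within each state sum to one.
    totals = np.bincount(state_of, weights=w)
    return w / (totals[state_of] + eps), h.ravel()


# Toy data: 2 states with 3 Gaussians each, 8 training speakers (all synthetic).
rng = np.random.default_rng(42)
state_of = np.array([0, 0, 0, 1, 1, 1])
V_train = rng.random((6, 8)) + 0.1
W, H = nmf_kl(V_train, rank=2)
adapted, coeffs = adapt_weights(rng.random(6) + 0.1, W, state_of)
```

In this sketch the expensive factorization runs once on the training speakers; adapting to a new speaker only touches the small vector `coeffs`, which is one plausible reading of why the reported adaptation converges so quickly.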



Fast Speaker Adaptation | Gaussian Densities | ICASSP 2008 | Observation Densities | Signal Processing


Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Jacques Duchateau, Tobias Leroy, Kris Demuynck, Hugo Van Hamme
