Rapid Speaker Adaptation Using Clustered Maximum-Likelihood Linear Basis With Sparse Training Data

Abstract: Speaker-space-based adaptation methods for automatic speech recognition have been shown to provide significant performance improvements for tasks where only a few seconds of adaptation speech are available. However, these techniques are not widely used in practical applications because they require large amounts of speaker-dependent training data and large amounts of computer memory. The authors propose a robust, low-complexity technique within this general class that has been shown to reduce word error rate, reduce the large storage requirements associated with speaker-space approaches, and eliminate the need for large numbers of utterances per speaker in training. The technique represents each speaker as a linear combination of clustered linear basis vectors, and a procedure is presented for maximum-likelihood estimation of these vectors from training data. Significant word error rate reduction was obtained using these methods relative to speaker-independent perfor...
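To make the phrase "linear combination of basis vectors" concrete, here is a minimal sketch of the generic speaker-space idea, not the authors' clustered procedure. It assumes a single Gaussian with identity covariance, a basis that is already given, and hypothetical names (`estimate_weights`, `adapted_mean`, `solve_linear`). Under those assumptions, the maximum-likelihood weights for a new speaker reduce to a least-squares fit of the average adaptation frame. The paper's estimation of the basis vectors themselves from training data is not shown.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def estimate_weights(basis, mean, frames):
    """ML weights w for frames ~ N(mean + sum_k w[k] * basis[k], I).

    Assumption: identity covariance, so the ML solution is the
    least-squares fit of the average frame (normal equations).
    basis: K vectors of length D; mean: length D; frames: T x D.
    """
    dim = len(mean)
    avg = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    resid = [avg[d] - mean[d] for d in range(dim)]
    gram = [[sum(bi[d] * bj[d] for d in range(dim)) for bj in basis]
            for bi in basis]
    rhs = [sum(bi[d] * resid[d] for d in range(dim)) for bi in basis]
    return solve_linear(gram, rhs)


def adapted_mean(basis, mean, w):
    """Speaker-adapted mean: global mean plus weighted basis vectors."""
    return [mean[d] + sum(wk * bk[d] for wk, bk in zip(w, basis))
            for d in range(len(mean))]


# Toy data (hypothetical): two basis vectors in a 3-dimensional space.
basis = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
mean = [0.0, 0.0, 0.0]
frames = [[0.6, -1.1, -0.5], [0.4, -0.9, -0.5]]  # a few adaptation frames
weights = estimate_weights(basis, mean, frames)
speaker_mean = adapted_mean(basis, mean, weights)
```

Only the K weights are estimated per speaker, which is why methods in this class can adapt from a few seconds of speech: the number of free parameters is far smaller than the dimension of the model.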

Yun Tang, Richard Rose

Large Amounts | Speaker Space | TASLP 2008 | Word Error Rate

Added	15 Dec 2010
Updated	15 Dec 2010
Type	Journal
Year	2008
Where	TASLP
Authors	Yun Tang, Richard Rose
