ICASSP
2011
IEEE

Soft frame margin estimation of Gaussian Mixture Models for speaker recognition with sparse training data

Discriminative Training (DT) methods for acoustic modeling, such as MMI, MCE, and SVM, have proven effective in speaker recognition. In this paper we propose a DT method for GMMs using soft frame margin estimation. Unlike other DT methods such as MMI or MCE, soft frame margin estimation attempts to enhance the generalization capability of the GMM to unseen data when there is a mismatch between the training data and the unseen data. We define an objective function that integrates a multi-class separation frame margin and a loss function, both expressed as functions of GMM likelihoods. We propose to optimize this objective function with a convex optimization technique, semidefinite programming. As shown in our experimental results, the proposed soft frame margin discriminative training with semidefinite programming optimization (SFMESDP) is very effective for robust speaker model training when only limited amounts of training data are available.
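The abstract does not give the exact form of the frame margin or the loss, so the following is only a minimal sketch under stated assumptions. It assumes diagonal-covariance GMMs, takes the frame margin to be the true speaker's frame log-likelihood minus the best competing speaker's, and uses a hinge-style soft margin loss. The function names, the `rho` margin parameter, and the `penalty` weight are all hypothetical, and the semidefinite programming optimization is not shown.

```python
import numpy as np


def diag_gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    Returns an array of shape (T,).
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )                                                           # (T, K)
    log_weighted = log_comp + np.log(weights)[None, :]
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    total = peak + np.log(np.exp(log_weighted - peak).sum(axis=1, keepdims=True))
    return total[:, 0]


def frame_margins(loglik, true_speaker):
    """Assumed multi-class frame margin.

    loglik: (S, T) frame log-likelihoods, one row per speaker model.
    Returns, per frame, the true speaker's score minus the best competitor's.
    """
    competitors = np.delete(loglik, true_speaker, axis=0)
    return loglik[true_speaker] - competitors.max(axis=0)


def soft_margin_objective(margins, rho, penalty=1.0):
    """Assumed hinge-style objective (lower is better).

    Rewards a large target margin rho and penalises frames whose
    margin falls short of it.
    """
    slack = np.maximum(0.0, rho - margins)
    return -rho + penalty * slack.mean()
```

In this sketch, a frame contributes to the loss only when its margin is below `rho`, which is what makes the margin "soft": a few badly separated frames are tolerated at a cost instead of being forced to satisfy a hard constraint.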
Yan Yin, Qi Li
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where ICASSP
Authors Yan Yin, Qi Li