Soft frame margin estimation of Gaussian Mixture Models for speaker recognition with sparse training data

Discriminative training (DT) methods for acoustic modeling, such as MMI, MCE, and SVM, have proven effective in speaker recognition. In this paper we propose a DT method for GMMs using soft frame margin estimation. Unlike other DT methods such as MMI or MCE, soft frame margin estimation attempts to improve the generalization of the GMM to unseen data when there is a mismatch between the training data and the unseen data. We define an objective function that integrates a multi-class separation frame margin and a loss function, both expressed as functions of GMM likelihoods. We propose to optimize the objective function with a convex optimization technique, semidefinite programming. As shown in our experimental results, the proposed soft frame margin discriminative training with semidefinite programming optimization (SFMESDP) is very effective for robust speaker model training when only limited amounts of training data are available.
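The abstract does not give the exact objective, so the following is only a rough sketch of how a frame margin and a soft-margin loss could be computed from GMM likelihoods. Everything here is an assumption made for illustration: the frame margin is taken to be the true speaker's log-likelihood minus the best competing speaker's, the loss is a hinge on frames whose margin falls below a target `rho`, and the `lam / rho` trade-off term is borrowed from generic soft margin estimation. The function names and parameters are hypothetical, and the semidefinite programming step from the paper is not shown.

```python
import numpy as np

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, K)
    a = np.log(weights) + log_comp
    # Log-sum-exp over mixture components, stabilized by the row maximum.
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

def frame_margins(frames, models, true_idx):
    """Assumed frame margin: true-speaker score minus best rival score."""
    scores = np.stack(
        [gmm_frame_loglik(frames, *model) for model in models], axis=1
    )                                                                  # (T, S)
    rivals = np.delete(scores, true_idx, axis=1)
    return scores[:, true_idx] - rivals.max(axis=1)

def soft_margin_objective(margins, rho, lam):
    """Assumed objective: lam / rho plus mean hinge loss below margin rho."""
    hinge = np.maximum(0.0, rho - margins)
    return lam / rho + hinge.mean()
```

In this sketch each entry of `models` is a `(weights, means, variances)` tuple for one speaker. Frames whose margin already exceeds `rho` contribute nothing to the loss, so only frames near or beyond the decision boundary would drive a parameter update.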

Yan Yin, Qi Li

Frame Margin | Frame Margin Estimation | ICASSP 2011 | Signal Processing | Soft Frame Margin

Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Yan Yin, Qi Li
