Prosodic information has been used successfully for speaker recognition for more than a decade. The best-performing prosodic system to date is based on features extracted over syllables obtained automatically from speech recognition output. These features are transformed using a Fisher kernel, and speaker models are trained using support vector machines (SVMs). Recently, a simpler version of these features, based on pseudo-syllables, was shown to perform well when modeled using joint factor analysis (JFA). In this work, we study both modeling techniques for the simpler feature set. We show that, for these features, a combination of JFA systems for different sequence lengths greatly outperforms both original modeling methods. Furthermore, we show that the combination of both methods gives significant improvements over the best single system. Overall, a performance improvement of 30% in the detection cost function (DCF) with respect to the two previously published methods is achieved.