Open-set speaker identification in broadcast news

In this paper, we examine the problem of text-independent open-set speaker identification (OS-SI) in broadcast news. In particular, we investigate the impact of the population of registered speakers on OS-SI performance, which is the central issue in designing practical OS-SI systems. We amend the maximum mutual information (MMI)-based discriminative training scheme to facilitate its incorporation into OS-SI systems. We also improve the implementation to allow the application of the MMI-based approach with 2048-component Gaussian mixture models. All systems are evaluated using the NIST RT-03, RT-04 and FBIS corpora, with a maximum of 82 registered speakers. Our study shows that a notable performance improvement can be obtained with MMI-based discriminative training, which reduces the equal error rate (EER) by 15.9% relative to the GMM-MAP scheme.
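For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of an open-set decision rule and of how an equal error rate and a relative reduction might be computed. It is not the authors' implementation: the function names (`open_set_decide`, `equal_error_rate`, `relative_reduction`) are hypothetical, and the sketch simplifies the task to accepting or rejecting the top-scoring speaker, ignoring confusions among registered speakers.

```python
import numpy as np


def open_set_decide(scores, threshold):
    """Return the best-scoring registered speaker, or None if the
    best score falls below the acceptance threshold (unknown speaker).

    `scores` maps speaker name -> model score (e.g. a log-likelihood ratio).
    """
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping the threshold over all observed scores
    and picking the point where false acceptance and false rejection are closest."""
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, impostor]))
    best_gap, eer = np.inf, None
    for t in thresholds:
        far = np.mean(impostor >= t)  # impostors wrongly accepted
        frr = np.mean(target < t)     # registered speakers wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline
```

Under this definition, a "15.9% relative" reduction means `relative_reduction(baseline_eer, new_eer)` is about 0.159. The absolute EER values are not given in the abstract.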

Chao Gao, Guruprasad Saikumar, Amit Srivastava, Premkumar Natarajan

Discriminative Training | Discriminative Training Scheme | ICASSP 2011 | MMI-based Discriminative Training | Signal Processing |

» Automatic named identification of speakers using diarization and ASR systems

» Multimodal Speaker Identification Based on Text and Speech

» Metaclassification Combining Multimodal Classifiers

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Chao Gao, Guruprasad Saikumar, Amit Srivastava, Premkumar Natarajan
