Approaching human listener accuracy with modern speaker verification

Being able to recognize people from their voice is a natural ability that we take for granted. Recent advances have shown significant improvement in automatic speaker recognition performance. Besides being able to process large amounts of data in a fraction of the time required by a human, automatic systems are now able to deal with diverse channel effects. The goal of this paper is to examine how a state-of-the-art automatic system performs in comparison with human listeners, and to investigate a strategy for a human-assisted form of automatic speaker recognition, which is useful in forensic investigation. We set up an experimental protocol using data from the NIST SRE 2008 core set. A total of 36 listeners from three sites, namely Australia, Finland and Singapore, participated in the listening experiments. The state-of-the-art automatic system achieved a 20% error rate, whereas the fusion of human listeners achieved 22%.

Automatic Speaker Recognition | Human Listeners | INTERSPEECH 2010 | Signal Processing | Speaker Recognition Performance |

Post Info

Added	18 May 2011
Updated	18 May 2011
Type	Journal
Year	2010
Where	INTERSPEECH
Authors	Ville Hautamäki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma, Haizhou Li
