
ICASSP 2009, IEEE

Maximizing global entropy reduction for active learning in speech recognition

We propose a new active learning algorithm to address the problem of selecting a limited subset of utterances for transcription from a large pool of unlabeled utterances so that the accuracy of the automatic speech recognition system can be maximized. Our algorithm differentiates itself from earlier work in that it uses a criterion that maximizes the lattice entropy reduction over the whole dataset. We introduce our criterion, show how it can be simplified and approximated, and describe the detailed algorithm to optimize the criterion. We demonstrate the effectiveness of our new algorithm with directory assistance data collected under real usage scenarios and show that it consistently outperforms the confidence-based approach by a significant margin. Using the algorithm cuts the number of utterances that need to be transcribed by 50% to achieve the same recognition accuracy obtained with the confidence-based approach, and by 60% compared to the random sampling approach.
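To make the selection idea concrete, below is a minimal, hedged sketch of entropy-driven utterance selection in Python. It is not the paper's exact criterion or its approximations; the functions `lattice_entropy`, `select_utterances`, the `estimated_reduction` callback, and the toy posteriors are all illustrative assumptions. The key point it mirrors is that utterances are ranked by an estimate of how much transcribing them would reduce entropy globally, rather than by per-utterance confidence alone.

```python
import numpy as np


def lattice_entropy(posteriors):
    """Entropy of a path-posterior distribution from a recognition lattice.

    `posteriors` is a 1-D array of path posteriors summing to 1; this is a
    stand-in for the lattice-level quantity discussed in the abstract.
    """
    p = np.asarray(posteriors, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def select_utterances(candidates, estimated_reduction, budget):
    """Greedily pick `budget` utterances with the largest estimated
    global entropy reduction.

    `candidates` maps utterance ids to lattice posteriors;
    `estimated_reduction` is a caller-supplied function approximating how
    much transcribing one utterance would reduce entropy over the whole
    dataset -- the global quantity a confidence-based selector ignores.
    """
    scores = {uid: estimated_reduction(uid, post)
              for uid, post in candidates.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Toy example: the flatter (higher-entropy) lattice is assumed to
    # yield the larger global reduction once transcribed.
    cands = {
        "utt1": [0.9, 0.05, 0.05],   # recognizer already confident
        "utt2": [0.4, 0.35, 0.25],   # ambiguous lattice
    }
    reduction = lambda uid, post: lattice_entropy(post)  # crude proxy
    print(select_utterances(cands, reduction, budget=1))  # -> ['utt2']
```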
Type: Conference
Year: 2009
Where: ICASSP
Authors: Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero