Discriminative duration modeling for speech recognition with segmental conditional random fields



This paper describes a new approach to modeling duration for LVCSR using SCARF, a toolkit for speech recognition with segmental conditional random fields. We use SCARF's ability to integrate long-span, segment-level features to design and test duration models that help discriminate between correct and incorrect word hypotheses. We show that the duration distributions of correct and incorrect word hypotheses differ. Given a word hypothesis in the lattice and its duration, conditional length probabilities are integrated into the SCARF system as duration features. We evaluate three kinds of duration features on Broadcast News: word durations, pre- and post-pausal durations, and word span confusions. Adding the duration features to SCARF results in an improvement of up to 0.3% over a state-of-the-art discriminatively trained baseline of 15.3% WER on a Broadcast News task.
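The abstract's central idea, turning conditional length probabilities into a feature for a word hypothesis, can be illustrated with a minimal Python sketch. This is not the SCARF API and not necessarily the authors' formulation: the function names, the frame-count histogram, the cap on duration, and the additive smoothing are all assumptions made for illustration. The feature is the log-ratio of the smoothed probability of a duration given that the word hypothesis is correct versus incorrect.

```python
import math
from collections import Counter, defaultdict

MAX_FRAMES = 200   # assumed cap on duration bins (hypothetical choice)
SMOOTHING = 1.0    # assumed additive smoothing constant


def train_duration_counts(hypotheses):
    """Build duration histograms keyed by (word, is_correct).

    `hypotheses` is an iterable of (word, duration_in_frames, is_correct).
    """
    counts = defaultdict(Counter)
    for word, frames, is_correct in hypotheses:
        counts[(word, is_correct)][min(frames, MAX_FRAMES)] += 1
    return counts


def log_length_prob(counts, word, frames, is_correct):
    """Smoothed log P(duration | word, correctness)."""
    hist = counts[(word, is_correct)]
    total = sum(hist.values())
    frames = min(frames, MAX_FRAMES)
    # Bins run from 0 to MAX_FRAMES inclusive, hence MAX_FRAMES + 1 bins.
    numerator = hist[frames] + SMOOTHING
    denominator = total + SMOOTHING * (MAX_FRAMES + 1)
    return math.log(numerator / denominator)


def duration_feature(counts, word, frames):
    """Log-ratio feature: positive when the duration looks like a correct hypothesis."""
    return (log_length_prob(counts, word, frames, True)
            - log_length_prob(counts, word, frames, False))
```

In this sketch a word never seen in training gets a feature value of zero, so the duration feature simply abstains for unseen words.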

Justine T. Kao, Geoffrey Zweig, Patrick Nguyen


Conditional Random Fields | ICASSP 2011 | Incorrect Word | Signal Processing | Test Duration Models |


Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Justine T. Kao, Geoffrey Zweig, Patrick Nguyen
