Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech



We compare and contrast two different models for detecting sentence-like units in continuous speech. The first approach uses hidden Markov sequence models based on N-grams and maximum likelihood estimation, and employs model interpolation to combine different representations of the data. The second approach models the posterior probabilities of the target classes; it is discriminative and integrates multiple knowledge sources in the maximum entropy (maxent) framework. Both models combine lexical, syntactic, and prosodic information. We develop a technique for integrating pretrained probability models into the maxent framework, and show that this approach can improve on an HMM-based state-of-the-art system for the sentence-boundary detection task. An even more substantial improvement is obtained by combining the posterior probabilities of the two systems.
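The final step described above, combining the posterior probabilities of the two systems, can be illustrated with a minimal sketch. The abstract does not state the combination rule, so linear interpolation with a single weight is an assumption here, as are the function names `combine_posteriors` and `detect_boundaries`, the 0.5 decision threshold, and the example probabilities, which are made up for illustration.

```python
from typing import List, Sequence


def combine_posteriors(
    p_hmm: Sequence[float],
    p_maxent: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Linearly interpolate two boundary posteriors, one pair per word boundary.

    `weight` is the share given to the HMM posterior (an assumed, tunable
    parameter); the maxent posterior receives 1 - weight.
    """
    if len(p_hmm) != len(p_maxent):
        raise ValueError("both systems must score the same word boundaries")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * h + (1.0 - weight) * m for h, m in zip(p_hmm, p_maxent)]


def detect_boundaries(
    posteriors: Sequence[float], threshold: float = 0.5
) -> List[bool]:
    """Mark a sentence boundary wherever the posterior reaches the threshold."""
    return [p >= threshold for p in posteriors]


if __name__ == "__main__":
    # Hypothetical posteriors P(boundary | features) at three word boundaries.
    hmm_scores = [0.9, 0.2, 0.6]
    maxent_scores = [0.7, 0.1, 0.3]
    combined = combine_posteriors(hmm_scores, maxent_scores, weight=0.5)
    print(detect_boundaries(combined))
```

In this made-up example the third boundary is accepted by the HMM scores alone (0.6) but rejected after interpolation, which shows how the second system can change a borderline decision. In practice the weight and threshold would be tuned on held-out data.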

Yang Liu, Andreas Stolcke, Elizabeth Shriberg, Mar


EMNLP 2004 | EMNLP 2007 | Markov Sequence Models | Maximum Likelihood Estimation | Posterior Probabilities |


Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2004
Where	EMNLP
Authors	Yang Liu, Andreas Stolcke, Elizabeth Shriberg, Mary P. Harper
