Cross-Genre Feature Comparisons for Spoken Sentence Segmentation



Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but can impose significant computational challenges because of the large size of feature sets. Little is understood about which features most benefit performance, particularly for speech data from different speaking styles. We compare sentence segmentation for speech from broadcast news versus natural multi-party meetings, using identical lexical and prosodic feature sets across genres. Results based on boosting and forward selection for this task show that (1) feature sets can be reduced with little or no loss in performance, and (2) the contribution of different feature types differs significantly by genre. We conclude that more efficient approaches to sentence segmentation and similar tasks can be achieved, especially if genre differences are taken into account.

Sébastien Cuendet, Dilek Z. Hakkani-Tür, Elizabeth Shriberg, James Fung, Benoît Favre


Automatic Sentence Segmentation | Feature Sets | Semantic Computing | SEMCO 2007 | Sentence Segmentation


» Sentence segmentation and punctuation recovery for spoken language translation

» FeasPar - A Feature Structure Parser Learning to Parse Spoken Language

» Two-level speech recognition to enhance the performance of spoken dialogue systems

» Genre effects on automatic sentence segmentation of speech: A comparison of broadcast news ...

» Simultaneous English-Japanese Spoken Language Translation Based on Incremental Dependency P...

» Re-Ranking Models for Spoken Language Understanding

» Training Global Linear Models for Chinese Word Segmentation

Post Info

Added	04 Jun 2010
Updated	04 Jun 2010
Type	Conference
Year	2007
Where	SEMCO
Authors	Sébastien Cuendet, Dilek Z. Hakkani-Tür, Elizabeth Shriberg, James Fung, Benoît Favre
