Punctuating speech for information extraction

Download ssli.ee.washington.edu

This paper studies the effect of automatic sentence boundary detection and comma prediction on entity and relation extraction in speech. We show that punctuating the machine-generated transcript according to the maximum F-measure of period and comma annotation results in suboptimal information extraction. Specifically, period and comma decision thresholds can be chosen to improve the entity value score and the relation value score by 4% relative. Error analysis shows that preventing noun-phrase splitting by generating longer sentences and fewer commas can be harmful for information extraction (IE) performance. Indeed, it seems that missed punctuation allows syntactic parsers to merge noun phrases, which prevents the extraction of correct information.
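The threshold selection described in the abstract can be illustrated with a minimal sketch. It assumes each word carries a period posterior and a comma posterior from some punctuation classifier, and that a downstream scoring function is available. All names here (`punctuate`, `f_measure`, `select_thresholds`, and the `score` callback) are hypothetical and are not taken from the paper; the callback stands in for either the punctuation F-measure or an IE metric such as the entity value score.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def punctuate(words: Sequence[str],
              period_post: Sequence[float],
              comma_post: Sequence[float],
              period_thr: float,
              comma_thr: float) -> List[str]:
    """Insert '.' or ',' after a word when its posterior clears the threshold.

    A period takes priority over a comma at the same position (an assumption).
    """
    out: List[str] = []
    for word, p_period, p_comma in zip(words, period_post, comma_post):
        out.append(word)
        if p_period >= period_thr:
            out.append(".")
        elif p_comma >= comma_thr:
            out.append(",")
    return out


def f_measure(hyp: Sequence[bool], ref: Sequence[bool]) -> float:
    """F1 over per-position binary punctuation decisions."""
    tp = sum(1 for h, r in zip(hyp, ref) if h and r)
    fp = sum(1 for h, r in zip(hyp, ref) if h and not r)
    fn = sum(1 for h, r in zip(hyp, ref) if not h and r)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_thresholds(grid: Sequence[float],
                      score: Callable[[float, float], float]
                      ) -> Tuple[float, float]:
    """Return the (period_thr, comma_thr) pair from the grid that maximizes score."""
    return max(product(grid, grid), key=lambda pair: score(*pair))
```

Calling `select_thresholds` once with a punctuation F-measure callback and once with an IE-score callback would, under these assumptions, show whether the two objectives prefer different operating points, which is the effect the abstract reports.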

Benoît Favre, Ralph Grishman, Dustin Hillard, Heng Ji, Dilek Hakkani-Tür, Mari Ostendorf

Comma Decision Thresholds | Comma Prediction | Commas | ICASSP 2008 | Signal Processing |

» Towards a Syntactic Account of Punctuation

» Prosody-Based Automatic Segmentation of Speech into Sentences and Topics

» Sentence-Internal Prosody Does not Help Parsing the Way Punctuation Does

» Named Entity Extraction from Noisy Input: Speech and OCR

» Syntactically-informed models for comma prediction

» Multisensory speech processing incorporating automatically extracted hidden dynamic inform...

» Formatting Time-Aligned ASR Transcripts for Readability

» Experiments on Sentence Boundary Detection

Post Info

Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Benoît Favre, Ralph Grishman, Dustin Hillard, Heng Ji, Dilek Hakkani-Tür, Mari Ostendorf

Sciweavers
