Sciweavers

MLMI
2007
Springer

Automatic Labeling Inconsistencies Detection and Correction for Sentence Unit Segmentation in Conversational Speech

In conversational speech, irregularities such as overlaps and disruptions make it difficult to decide what constitutes a sentence. Thus, despite very precise guidelines on how to label conversational speech with dialog acts (DA), labeling inconsistencies are likely to appear. In this work, we present various methods to detect labeling inconsistencies in the ICSI meeting corpus. We show that by automatically detecting and removing the inconsistent examples from the training data, we significantly improve the sentence segmentation accuracy. We then manually analyze 200 of the noisy examples detected by the system and observe that only 13% of them are labeling inconsistencies, while the rest are errors made by the classifier. The errors naturally cluster into 5 main classes, for each of which we give hints on how the system can be improved to avoid these mistakes. Key words: automatic relabeling, error correction, boosting, sentence segmentation, noisy data.
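The abstract does not spell out the detection procedure, so the following is only an illustrative sketch of the general idea of flagging suspicious training labels and removing them. It assumes a generic cross-validation filter: each example is classified by a model trained without it, and examples whose label disagrees with the prediction are flagged. A single-threshold decision stump stands in for the boosted classifier mentioned in the key words, and the feature (pause duration in milliseconds), the toy data, and the names `train_stump` and `flag_suspects` are all hypothetical.

```python
from typing import List, Sequence, Tuple

# (pause duration in ms, label): 1 = sentence boundary, 0 = no boundary.
# Both the feature and the data are made up for illustration.
Example = Tuple[float, int]


def train_stump(data: Sequence[Example]) -> float:
    """Return the threshold t with the fewest training errors
    for the rule "predict 1 if x >= t, else 0"."""
    xs = sorted({x for x, _ in data})
    # Candidates: below all points, midpoints between neighbours, above all points.
    candidates = (
        [xs[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
        + [xs[-1] + 1.0]
    )

    def errors(t: float) -> int:
        return sum(int(x >= t) != y for x, y in data)

    return min(candidates, key=errors)


def flag_suspects(data: Sequence[Example], k: int = 5) -> List[int]:
    """Return indices whose label disagrees with a stump trained
    on the other k-1 folds (folds are assigned by index modulo k)."""
    suspects = []
    for fold in range(k):
        train = [ex for i, ex in enumerate(data) if i % k != fold]
        threshold = train_stump(train)
        for i, (x, y) in enumerate(data):
            if i % k == fold and int(x >= threshold) != y:
                suspects.append(i)
    return sorted(suspects)


# Toy data: long pauses are boundaries, except index 8, which is
# deliberately mislabeled to play the role of a labeling inconsistency.
data = [(20, 0), (50, 0), (80, 0), (110, 0), (150, 0),
        (600, 1), (650, 1), (720, 1), (800, 0), (900, 1)]

suspects = flag_suspects(data, k=5)
cleaned = [ex for i, ex in enumerate(data) if i not in suspects]
print("flagged indices:", suspects)
print("training examples kept:", len(cleaned))
```

A filter of this kind flags every disagreement between label and classifier, so it cannot by itself tell a mislabeled example from a classifier mistake, which is consistent with the abstract's observation that only 13% of the 200 manually inspected examples were genuine labeling inconsistencies.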
Sébastien Cuendet, Dilek Z. Hakkani-Tür, Elizabeth Shriberg
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where MLMI
Authors Sébastien Cuendet, Dilek Z. Hakkani-Tür, Elizabeth Shriberg