Improving Mention Detection Robustness to Noisy Input

Information-extraction (IE) research typically focuses on clean-text inputs. However, an IE engine serving real applications yields many false alarms due to less-well-formed input. For example, IE in a multilingual broadcast processing system has to deal with inaccurate automatic transcription and translation. The resulting presence of non-target-language text in this case, and non-language material interspersed in data from other applications, raise the research problem of making IE robust to such noisy input text. We address one such IE task: entity-mention detection. We describe augmenting a statistical mention-detection system in order to reduce false alarms from spurious passages. The diverse nature of input noise leads us to pursue a multi-faceted approach to robustness. For our English-language system, at various miss rates we eliminate 97% of false alarms on inputs from other Latin-alphabet languages. In another experiment, representing scenarios in which genre-specific traini...

Radu Florian, John F. Pitrelli, Salim Roukos, Imed Zitouni

EMNLP 2010 | False Alarms | Inaccurate Automatic Transcription | Natural Language Processing | Noisy Input Text

Post Info

Added	11 Feb 2011
Updated	11 Feb 2011
Type	Conference
Year	2010
Where	EMNLP
Authors	Radu Florian, John F. Pitrelli, Salim Roukos, Imed Zitouni
