Sentence-Based Active Learning Strategies for Information Extraction



Given a classifier trained on relatively few training examples, active learning (AL) consists of ranking a set of unlabeled examples by how informative they would be, if manually labeled, for retraining a (hopefully) better classifier. An important text learning task in which AL is potentially useful is information extraction (IE), namely, the task of identifying within a text the expressions that instantiate a given concept. We contend that IE differs from other text learning tasks in that it does not make sense to rank individual items (i.e., word occurrences) for annotation, and that the minimal unit of text presented to the annotator should be an entire sentence. In this paper we propose a range of active learning strategies for IE that are based on ranking individual sentences, and experimentally compare them on a standard dataset for named entity extraction.

Keywords: Information extraction, named entity recognition, active learning, selective sampling
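The abstract does not spell out the individual strategies, so the following is only an illustrative sketch of what sentence-level ranking could look like, not the authors' method. It assumes a hypothetical token classifier that returns a label-probability distribution for each token, scores each token by least-confidence uncertainty, and aggregates token scores into one score per sentence (by maximum or by average). All names (`token_uncertainty`, `sentence_score`, `rank_sentences`) are invented for this example.

```python
from typing import Dict, List, Sequence

# Assumed input format (hypothetical): a sentence is a list of tokens, and each
# token is a mapping from label to the classifier's predicted probability.
TokenDist = Dict[str, float]
Sentence = List[TokenDist]


def token_uncertainty(dist: TokenDist) -> float:
    """Least-confidence uncertainty: 1 minus the probability of the top label."""
    return 1.0 - max(dist.values())


def sentence_score(sentence: Sentence, how: str = "max") -> float:
    """Aggregate token uncertainties into a single sentence-level score."""
    scores = [token_uncertainty(dist) for dist in sentence]
    if not scores:
        return 0.0  # an empty sentence carries no information
    if how == "max":
        return max(scores)
    if how == "avg":
        return sum(scores) / len(scores)
    raise ValueError(f"unknown aggregation: {how!r}")


def rank_sentences(sentences: Sequence[Sentence], how: str = "max") -> List[int]:
    """Return sentence indices, most informative (highest score) first."""
    scored = [(sentence_score(s, how), i) for i, s in enumerate(sentences)]
    # Sort by descending score; break ties by original position.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

The annotator would then be shown whole sentences in the returned order. The choice of aggregation matters: `"max"` favors sentences containing a single highly uncertain token, while `"avg"` favors sentences that are uncertain throughout.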

Andrea Esuli, Diego Marcheggiani, Fabrizio Sebastiani


Active Learning | IIR 2010 | Information Extraction | Information Technology | Text Learning Tasks


Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	IIR
Authors	Andrea Esuli, Diego Marcheggiani, Fabrizio Sebastiani
