Sciweavers

JUCS
2007

Improving the Performance of a Tagger Generator in an Information Extraction Application

In this paper we report our experience extracting named entities from Spanish texts using stacking. Named Entity Extraction (NEE) is a subtask of Information Extraction that involves identifying the groups of words that make up the name of an entity and classifying these names into a set of predefined categories. Our approach is corpus-based: we use a re-trainable tagger generator to obtain a named entity extractor from a set of tagged examples. The main contribution of our work is that we obtain the systems needed in a stacking scheme without using any additional training material or tagger generators. Instead, we generate the variability that stacking requires by applying corpus transformations to the original training corpus. Once we have several versions of the training corpus, we generate several extractors and combine them by means of a machine learning algorithm. Experiments show that the combination of corpus transformation and stac...
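The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the three transformations (identity, lowercase, shape), the lookup-table "tagger generator", and the lookup-table combiner are all hypothetical stand-ins chosen to keep the example self-contained. Splitting the original corpus into a training part and a held-out part for the combiner is likewise an assumption of this sketch.

```python
from collections import Counter, defaultdict

# Toy tagged corpus (invented for illustration): (token, tag) pairs.
CORPUS = [("Juan", "PER"), ("vive", "O"), ("en", "O"), ("Sevilla", "LOC"),
          ("Ana", "PER"), ("viaja", "O"), ("a", "O"), ("Madrid", "LOC"),
          ("Ana", "PER"), ("vive", "O"), ("en", "O"), ("Madrid", "LOC")]
# Assumption: the single original corpus is split, so no extra material is used.
TRAIN, HELD_OUT = CORPUS[:8], CORPUS[8:]

# Hypothetical corpus transformations: each gives a different view of a token.
def identity(tok):
    return tok

def lowercase(tok):
    return tok.lower()

def shape(tok):
    return "Xx" if tok[:1].isupper() else "x"

TRANSFORMS = [identity, lowercase, shape]

def train_tagger(corpus, transform):
    """Stand-in tagger generator: most frequent tag per transformed token."""
    counts = defaultdict(Counter)
    for tok, tag in corpus:
        counts[transform(tok)][tag] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    # Tokens never seen in training default to the outside tag "O".
    return lambda tok: table.get(transform(tok), "O")

def train_stacker(taggers, held_out):
    """Stand-in combiner: learn gold tag from the tuple of base predictions."""
    counts = defaultdict(Counter)
    for tok, tag in held_out:
        counts[tuple(t(tok) for t in taggers)][tag] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}

    def combined(tok):
        votes = tuple(t(tok) for t in taggers)
        # Unseen prediction patterns fall back to a majority vote.
        return table.get(votes, Counter(votes).most_common(1)[0][0])

    return combined

# One tagger per transformed version of the same training corpus.
taggers = [train_tagger(TRAIN, f) for f in TRANSFORMS]
extract = train_stacker(taggers, HELD_OUT)
result = [extract(tok) for tok in ["Juan", "viaja", "a", "Madrid"]]
```

On this toy data, the shape-based tagger alone mislabels "Madrid", and the combiner learns from the held-out tokens to trust the other two taggers for that prediction pattern. A real system would replace both lookup tables with a trained tagger and a proper machine learning classifier.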
Added 16 Dec 2010
Updated 16 Dec 2010
Type Journal
Year 2007
Where JUCS
Authors José A. Troyano, Fernando Enríquez, Fermín Cruz, José Miguel Cañete Valdeón, F. Javier Ortega