Sciweavers

IJCAI
2001

Adaptive Information Extraction from Text by Rule Induction and Generalisation

(LP)² is a covering algorithm for adaptive Information Extraction (IE) from text. It induces symbolic rules that insert SGML tags into texts by learning from examples found in a user-defined tagged corpus. Training is performed in two steps: first a set of tagging rules is learned; then additional rules are induced to correct mistakes and imprecision in tagging. Induction is performed by bottom-up generalization of examples in the training corpus. Shallow knowledge about Natural Language Processing (NLP) is used in the generalization process. The algorithm has a considerable success story. From a scientific point of view, experiments report excellent results with respect to the current state of the art on two publicly available corpora. From an application point of view, a successful industrial IE tool has been based on (LP)². Real-world applications have been developed and licenses have been released to external companies for building other applications. This paper presents (LP)²...
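To make the idea of a covering algorithm with bottom-up generalization more concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the rule format (one word of left context and one word of right context around an insertion point), the function names, the `max_error` threshold and the tiny corpus are all assumptions made for illustration. Only the first training step (learning tagging rules) is sketched; the correction step and the use of shallow NLP knowledge are left out.

```python
from itertools import product

WILDCARD = None  # a condition that has been dropped (generalised away)


def matches(rule, tokens, i):
    """A hypothetical rule is (left_word, right_word); it fires at the gap before tokens[i]."""
    left, right = rule
    prev = tokens[i - 1] if i > 0 else "<s>"
    return (left is WILDCARD or left == prev) and (right is WILDCARD or right == tokens[i])


def generalisations(seed):
    """All rules obtained by keeping or dropping each condition of the seed rule."""
    left, right = seed
    return list(product((left, WILDCARD), (right, WILDCARD)))


def score(rule, corpus):
    """Count correct and wrong tag insertions the rule would make on the corpus."""
    correct = wrong = 0
    for tokens, positions in corpus:
        for i in range(len(tokens)):
            if matches(rule, tokens, i):
                if i in positions:
                    correct += 1
                else:
                    wrong += 1
    return correct, wrong


def induce(corpus, max_error=0.0):
    """Covering loop: pick an uncovered example, generalise it, drop what the new rule covers."""
    uncovered = {(s, i) for s, (_, positions) in enumerate(corpus) for i in positions}
    rules = []
    while uncovered:
        s, i = min(uncovered)  # deterministic choice of the next seed example
        tokens = corpus[s][0]
        seed = (tokens[i - 1] if i > 0 else "<s>", tokens[i])
        best_rule, best_correct = seed, -1  # fall back to the seed if nothing is accurate enough
        for rule in generalisations(seed):
            correct, wrong = score(rule, corpus)
            if wrong / (correct + wrong) <= max_error and correct > best_correct:
                best_rule, best_correct = rule, correct
        rules.append(best_rule)
        uncovered = {(s2, i2) for (s2, i2) in uncovered
                     if not matches(best_rule, corpus[s2][0], i2)}
    return rules


# Toy tagged corpus: each entry is (tokens, positions where an opening tag belongs).
corpus = [
    (["seminar", "at", "4pm", "today"], {2}),
    (["talk", "at", "3pm", "in", "hall"], {2}),
    (["meet", "at", "noon"], {2}),
]
rules = induce(corpus)
print(rules)
```

On this toy corpus the seed rule taken from the first example is generalised by dropping its right-hand word, because the resulting rule ("insert the tag after the word at") covers all three examples without any wrong insertion. A single rule therefore covers the whole corpus and the loop stops.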
Fabio Ciravegna
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where IJCAI
Authors Fabio Ciravegna