Adaptive Information Extraction from Text by Rule Induction and Generalisation



(LP)² is a covering algorithm for adaptive Information Extraction (IE) from text. It induces symbolic rules that insert SGML tags into texts by learning from examples found in a user-defined tagged corpus. Training is performed in two steps: initially a set of tagging rules is learned; then additional rules are induced to correct mistakes and imprecision in tagging. Induction is performed by bottom-up generalization of examples in the training corpus. Shallow knowledge about Natural Language Processing (NLP) is used in the generalization process. The algorithm has a considerable success story. From a scientific point of view, experiments report excellent results with respect to the current state of the art on two publicly available corpora. From an application point of view, a successful industrial IE tool has been based on (LP)². Real-world applications have been developed and licenses have been released to external companies for building other applications. This paper presents (LP)²...
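The covering loop with bottom-up generalization described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation. The window size, the `shape` token classes (a crude stand-in for shallow NLP knowledge), the error threshold, the sample corpus, and all function names are assumptions made for illustration. The sketch keeps a single best rule per seed and omits the second training step, in which correction rules are induced.

```python
from itertools import product

WINDOW = 2  # context tokens on each side of the insertion point (assumed value)

def shape(tok):
    """Coarse token class; a crude stand-in for shallow NLP knowledge."""
    if tok == "<PAD>":
        return "<PAD>"
    if tok.isdigit():
        return "<NUM>"
    if tok[:1].isupper():
        return "<CAP>"
    return "<LOW>"

def window(tokens, i):
    """Tokens around insertion point i (the tag would go before tokens[i])."""
    padded = ["<PAD>"] * WINDOW + list(tokens) + ["<PAD>"] * WINDOW
    return tuple(padded[i:i + 2 * WINDOW])

def matches(rule, win):
    """Each rule slot is a literal word, a shape class, or None (wildcard)."""
    return all(c is None or c == t or c == shape(t) for c, t in zip(rule, win))

def generalisations(seed):
    """Relax each slot independently: keep the word, use its shape, or drop it."""
    return product(*[(tok, shape(tok), None) for tok in seed])

def score(rule, corpus):
    """Return (set of covered positive positions, count of false matches)."""
    covered, errors = set(), 0
    for d, (tokens, positives) in enumerate(corpus):
        for i in range(len(tokens)):
            if matches(rule, window(tokens, i)):
                if i in positives:
                    covered.add((d, i))
                else:
                    errors += 1
    return covered, errors

def induce(corpus, max_error_rate=0.0):
    """Covering loop: seed a rule from an uncovered example, generalise it,
    keep the best acceptable rule, and remove the examples it covers."""
    uncovered = {(d, i) for d, (_, pos) in enumerate(corpus) for i in pos}
    rules = []
    while uncovered:
        d, i = min(uncovered)
        seed = window(corpus[d][0], i)  # fully specific starting rule
        best, best_cov = None, set()
        for cand in generalisations(seed):
            cov, err = score(cand, corpus)
            acceptable = err <= max_error_rate * (len(cov) + err)
            if acceptable and len(cov) > len(best_cov):
                best, best_cov = cand, cov
        if best is not None:
            rules.append(best)
        uncovered -= best_cov | {(d, i)}  # always drop the seed so the loop ends
    return rules

# Hypothetical tagged corpus: (tokens, positions where a start-time tag begins).
corpus = [
    ("The seminar starts at 3 pm".split(), {4}),
    ("Talk begins at 10 am".split(), {3}),
    ("Room 5 is at the back".split(), set()),  # contains a number, but no tag
]
rules = induce(corpus)
print(rules)
```

Each rule is a tuple of slot constraints over the tokens surrounding the insertion point. Ties between equally good candidates go to the first one enumerated, which here is the more specific rule; a fuller treatment would keep several generalisations per seed and rank them by accuracy and coverage.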

Fabio Ciravegna


Adaptive Information Extraction | Bottom-up Generalization | IJCAI 2001 | IJCAI 2007 | Successful Industrial IE


Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2001
Where	IJCAI
Authors	Fabio Ciravegna
