An Algebraic Approach to Rule-Based Information Extraction

Traditional approaches to rule-based information extraction (IE) have primarily been based on regular expression grammars. However, these grammar-based systems have difficulty scaling to large data sets and large numbers of rules. Inspired by traditional database research, we propose an algebraic approach to rule-based IE that addresses these scalability issues through query optimization. The operators of our algebra are motivated by our experience in building several rule-based extraction programs over diverse data sets. We present the operators of our algebra and propose several optimization strategies motivated by the text-specific characteristics of our operators. Finally, we validate the potential benefits of our approach through extensive experiments over real-world blog data.

Frederick Reiss, Sriram Raghavan, Rajasekar Krishnamurthy, Huaiyu Zhu, Shivakumar Vaithyanathan

Database | ICDE 2008 | Rule-based Extraction Programs | Rule-based IE | Rule-based Information Extraction

» Rule-Based Information Extraction for Structured Data Acquisition using TextMarker

» Automatic Extraction of Definitions in Portuguese: A Rule-Based Approach

» A Rule-Based Extensible Stemmer for Information Retrieval with Application to Arabic

» NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontol...

» Rule-Based Autonomous Citation Mining with TIERL

» SystemT: An Algebraic Approach to Declarative Information Extraction

» Rule-Based Generation of XML Schemas from UML Class Diagrams

» An Intelligent Conversational Agent Approach to Extracting Queries from Natural Language

Post Info

Added	01 Nov 2009
Updated	01 Nov 2009
Type	Conference
Year	2008
Where	ICDE
Authors	Frederick Reiss, Sriram Raghavan, Rajasekar Krishnamurthy, Huaiyu Zhu, Shivakumar Vaithyanathan

Sciweavers
