SystemT: a system for declarative information extraction



As applications within and outside the enterprise encounter increasing volumes of unstructured data, there has been renewed interest in the area of information extraction (IE), the discipline concerned with extracting structured information from unstructured text. Classical IE techniques developed by the NLP community were based on cascading grammars and regular expressions. However, due to the inherent limitations of grammar-based extraction, these techniques are unable to: (i) scale to large data sets, and (ii) support the expressivity requirements of complex information tasks. At the IBM Almaden Research Center, we are developing SystemT, an IE system that addresses these limitations by adopting an algebraic approach. By leveraging well-understood database concepts such as declarative queries and cost-based optimization, SystemT enables scalable execution of complex information extraction tasks. In this paper, we motivate the SystemT approach to information extraction. We describe o...

Rajasekar Krishnamurthy, Yunyao Li, Sriram Raghavan, Frederick Reiss, Shivakumar Vaithyanathan, Huaiyu Zhu


Complex Extraction Tasks | Database | Extraction Algebra | Information Extraction Tasks | SIGMOD 2008 |


» SystemT: An Algebraic Approach to Declarative Information Extraction

» Enabling enterprise mashups over unstructured text feeds with InfoSphere MashupHub and Sys...

» The SystemT IDE: an integrated development environment for information extraction rules

» Automatic Rule Refinement for Information Extraction

» Hybrid in-database inference for declarative information extraction

» Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto

» Probabilistic Declarative Information Extraction

» Declarative analysis of noisy information networks

Post Info

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2008
Where	SIGMOD
Authors	Rajasekar Krishnamurthy, Yunyao Li, Sriram Raghavan, Frederick Reiss, Shivakumar Vaithyanathan, Huaiyu Zhu


Sciweavers
