LWA
2008

Rule-Based Information Extraction for Structured Data Acquisition using TextMarker

Download ki.informatik.uni-wuerzburg.de

Information extraction is concerned with locating specific items in (unstructured) textual documents and can be applied, for example, to the acquisition of structured data. The acquired data can then serve as input for mining methods that require structured data, in contrast to text mining methods that rely on a bag-of-words approach. This paper presents a semi-automatic approach for structured data acquisition using a rule-based information extraction system. We propose a semi-automatic process model that includes the TEXTMARKER system for information extraction and data acquisition from textual documents. TEXTMARKER applies simple rules for extracting blocks from a given (semi-structured) document, which can be further analyzed using domain-specific rules. Thus, both low-level and higher-level information extraction are supported. We demonstrate the applicability and benefit of the approach with two case studies from real-world applications.
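The two-level idea in the abstract (simple rules that cut a semi-structured document into blocks, followed by domain-specific rules applied to those blocks) can be illustrated with a minimal Python sketch. This is not the TEXTMARKER rule language or its API: the heading pattern, the `FIELD_RULES` table, the function names, and the sample text are all hypothetical and invented purely for illustration.

```python
import re

# Hypothetical low-level rule: a line of the form "Heading:" opens a new block.
HEADING = re.compile(r"^(?P<label>[A-Z][A-Za-z ]*):\s*$")

# Hypothetical domain-specific rules, keyed by block label (illustrative only).
FIELD_RULES = {
    "Patient": {"age": re.compile(r"(\d+)\s*years? old")},
    "Diagnosis": {"code": re.compile(r"\b([A-Z]\d{2}(?:\.\d)?)\b")},
}


def extract_blocks(text):
    """Low-level step: split a semi-structured document into labelled blocks."""
    blocks, label = {}, None
    for line in text.splitlines():
        stripped = line.strip()
        match = HEADING.match(stripped)
        if match:
            # A heading starts a new block; later lines are collected under it.
            label = match.group("label")
            blocks[label] = []
        elif label is not None and stripped:
            blocks[label].append(stripped)
    return {name: " ".join(lines) for name, lines in blocks.items()}


def extract_record(text):
    """Higher-level step: apply domain rules per block to fill a flat record."""
    record = {}
    for label, body in extract_blocks(text).items():
        for field, pattern in FIELD_RULES.get(label, {}).items():
            found = pattern.search(body)
            if found:
                record[field] = found.group(1)
    return record


# Invented sample document, not taken from the paper's case studies.
SAMPLE = """Patient:
The patient is 54 years old.

Diagnosis:
Essential hypertension, coded as I10.
"""

if __name__ == "__main__":
    print(extract_blocks(SAMPLE))
    print(extract_record(SAMPLE))
```

Keeping block extraction separate from field extraction means the domain rules only ever see the text of the block they belong to, which is one plausible reading of how low-level and higher-level extraction can be layered. The resulting flat record is the kind of structured input that downstream mining methods would consume.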

Martin Atzmüller, Peter Klügl, Frank Puppe

Information Extraction | LWA 2008 | Software Engineering | Structured Data | Textual Documents

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LWA
Authors	Martin Atzmüller, Peter Klügl, Frank Puppe
