Ontology-Based Extraction and Structuring of Information from Data-Rich Unstructured Documents


We present a new approach to extracting information from unstructured documents based on an application ontology that describes a domain of interest. Starting with such an ontology, we formulate rules to extract constants and context keywords from unstructured documents. For each unstructured document of interest, we extract its constants and keywords and apply a recognizer to organize extracted constants as attribute values of tuples in a generated database schema. To make our approach general, we fix all the processes and change only the ontological description for a different application domain. In experiments we conducted on two different types of unstructured documents taken from the Web, our approach attained recall ratios in the 80% and 90% range and precision ratios near 98%.
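The pipeline described in the abstract (ontology-derived rules, constant extraction, then structuring into tuples) can be illustrated with a very small sketch. This is not the authors' implementation: the car-advertisement domain, the attribute names, the regular expressions, and the `extract_record` helper are all hypothetical, and the sketch omits context keywords and the recognizer heuristics that the abstract mentions. It shows only the general idea that the extraction code stays fixed while the ontological description changes per domain.

```python
import re
from typing import Dict, Optional

# Hypothetical toy "ontology": each attribute of the target schema maps to
# a regular expression that recognizes its constants in free text.
CAR_AD_ONTOLOGY: Dict[str, str] = {
    "year": r"\b(?:19|20)\d{2}\b",
    "price": r"\$\d[\d,]*",
    "mileage": r"\b\d[\d,]*(?=\s*miles\b)",
}


def extract_record(text: str, ontology: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return one tuple (as a dict) holding the first constant found per attribute."""
    record: Dict[str, Optional[str]] = {}
    for attribute, pattern in ontology.items():
        match = re.search(pattern, text)
        # Attributes with no matching constant are left as None.
        record[attribute] = match.group(0) if match else None
    return record


ad = "1995 Honda Civic, 82,000 miles, runs great. $4,500 obo."
record = extract_record(ad, CAR_AD_ONTOLOGY)
print(record)
```

Applying the same function to a different domain would, under these assumptions, require only a different dictionary of attribute patterns.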

David W. Embley, Douglas M. Campbell, Randy D. Smith, Stephen W. Liddle


Application Ontology | CIKM 1998 | Extracted Constants | Information Management | Unstructured Documents |


» An Efficient Ontology-Based Expert Peering System

» A Mutually Beneficial Integration of Data Mining and Information Extraction

» TimeTrails: A System for Exploring Spatio-Temporal Information in Documents

» Business Specific Online Information Extraction from German Websites

» Rule-Based Information Extraction for Structured Data Acquisition using TextMarker

» Template-Based Information Mining from HTML Documents

» Ontology-based design information extraction and retrieval

» Mapping enterprise entities to text segments

Post Info

Added	05 Aug 2010
Updated	05 Aug 2010
Type	Conference
Year	1998
Where	CIKM
Authors	David W. Embley, Douglas M. Campbell, Randy D. Smith, Stephen W. Liddle
