
VLDB
2007
ACM

A Relational Approach to Incrementally Extracting and Querying Structure in Unstructured Data

There is a growing consensus that it is desirable to query over the structure implicit in unstructured documents, and that ideally this capability should be provided incrementally. However, there is no consensus about what kind of system should be used to support this kind of incremental capability. We explore using a relational system as the basis for a workbench for extracting and querying structure from unstructured data. As a proof of concept, we applied our relational approach to support structured queries over Wikipedia. We show that the data set is always available for some form of querying, and that as it is processed, users can pose a richer set of structured queries. We also provide examples of how we can incrementally evolve our understanding of the data in the context of the relational workbench.
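The incremental idea in the abstract can be illustrated with a small sketch. It is not the authors' workbench: the schema (`docs`, `facts`), the function names, the regular-expression extractor, and the toy pages are all assumptions made for illustration, using Python's standard `sqlite3` module. Raw text is queryable by keyword as soon as it is loaded, and each extraction pass adds rows that enable richer structured queries.

```python
import re
import sqlite3

# Made-up toy pages standing in for unstructured documents (not real data).
TOY_PAGES = [
    ("Alpha City", "Alpha City is a city. Population: 500000."),
    ("Beta Town", "Beta Town is a town. Population: 12000."),
    ("Gamma Lake", "Gamma Lake is a lake near Alpha City."),
]

def build_workbench(pages):
    """Load raw pages into an in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    # Extracted structure is stored as (document, attribute, value) rows.
    conn.execute("CREATE TABLE facts (doc_id INTEGER, attr TEXT, value TEXT)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", pages)
    return conn

def keyword_query(conn, word):
    """Available immediately: substring search over the raw text."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE body LIKE ? ORDER BY title",
        (f"%{word}%",),
    )
    return [r[0] for r in rows]

def extract_attribute(conn, attr, pattern):
    """One extraction pass: store the first regex capture per page as a fact."""
    regex = re.compile(pattern)
    for doc_id, body in conn.execute("SELECT id, body FROM docs").fetchall():
        match = regex.search(body)
        if match:
            conn.execute(
                "INSERT INTO facts VALUES (?, ?, ?)",
                (doc_id, attr, match.group(1)),
            )

def structured_query(conn, attr, min_value):
    """Possible only after extraction: filter pages on a numeric attribute."""
    rows = conn.execute(
        "SELECT d.title FROM docs d JOIN facts f ON f.doc_id = d.id "
        "WHERE f.attr = ? AND CAST(f.value AS INTEGER) >= ? ORDER BY d.title",
        (attr, min_value),
    )
    return [r[0] for r in rows]
```

Calling `keyword_query` right after `build_workbench(TOY_PAGES)` already returns matches, while `structured_query` returns nothing until `extract_attribute(conn, "population", r"Population: (\d+)")` has run. Keeping extracted values in a generic attribute-value table is one possible design choice here; it lets new attributes be added without altering the schema.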
Added: 09 Jun 2010
Updated: 09 Jun 2010
Type: Conference
Year: 2007
Where: VLDB
Authors: Eric Chu, Akanksha Baid, Ting Chen, AnHai Doan, Jeffrey F. Naughton