Sciweavers

SIGMOD
2008
ACM

Building query optimizers for information extraction: the SQoUT project

Text documents often embed data that is structured in nature. This structured data is increasingly exposed using information extraction systems, which generate structured relations from documents, introducing an opportunity to process expressive, structured queries over text databases. This paper discusses our SQoUT project, which focuses on processing structured queries over relations extracted from text databases. We show how, in our extraction-based scenario, query processing can be decomposed into a sequence of basic steps: retrieving relevant text documents, extracting relations from the documents, and joining extracted relations for queries involving multiple relations. Each of these steps presents different alternatives and together they form a rich space of possible query execution strategies. We identify execution efficiency and output quality as the two critical properties of a query execution, and argue that an optimization approach needs to consider both properties. To th...
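The three steps named in the abstract (retrieve, extract, join) can be illustrated with a toy sketch. This is not the SQoUT implementation: the documents, the regular-expression extractors, the keyword-based retrieval, and every function name below are assumptions made up for illustration only.

```python
import re

# Hypothetical toy text database (invented for this sketch).
DOCS = [
    "Microsoft is headquartered in Redmond. Satya Nadella is the CEO of Microsoft.",
    "Apple is headquartered in Cupertino.",
    "Tim Cook is the CEO of Apple.",
    "The weather in Seattle was rainy.",
]

# Hypothetical regex "extractors"; real extraction systems are far richer.
HQ_PATTERN = r"(\w+) is headquartered in (\w+)"   # yields (company, city)
CEO_PATTERN = r"(\w+ \w+) is the CEO of (\w+)"    # yields (person, company)


def retrieve(docs, keywords):
    """Step 1: keep only documents that mention at least one keyword."""
    return [d for d in docs if any(k in d for k in keywords)]


def extract(docs, pattern):
    """Step 2: run an extractor over the documents and collect its tuples."""
    tuples = set()
    for d in docs:
        tuples.update(re.findall(pattern, d))
    return tuples


def join_on_company(hq, ceo):
    """Step 3: join the two extracted relations on the company attribute."""
    return sorted(
        (company, city, person)
        for (company, city) in hq
        for (person, ceo_company) in ceo
        if company == ceo_company
    )


def answer_query(docs):
    """One possible execution strategy: keyword retrieval, then extract, then join."""
    hq = extract(retrieve(docs, ["headquartered"]), HQ_PATTERN)
    ceo = extract(retrieve(docs, ["CEO"]), CEO_PATTERN)
    return join_on_company(hq, ceo)


if __name__ == "__main__":
    for row in answer_query(DOCS):
        print(row)
```

Keyword retrieval is only one alternative for the first step; scanning every document would be another. The two would likely differ in running time and in how many tuples they find, which is the efficiency versus output quality trade-off the abstract points to.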
Alpa Jain, Panagiotis G. Ipeirotis, Luis Gravano
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2008
Where SIGMOD
Authors Alpa Jain, Panagiotis G. Ipeirotis, Luis Gravano