To search or to crawl?: towards a query optimizer for text-centric tasks

Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured relations from unstructured text; as another example, focused crawlers explore the web to locate pages about specific topics. Execution plans for text-centric tasks follow two general paradigms for processing a text database: either we can scan, or "crawl," the text database or, alternatively, we can exploit search engine indexes and retrieve the documents of interest via carefully crafted queries constructed in task-specific ways. The choice between crawl- and query-based execution plans can have a substantial impact on both execution time and output "completeness" (e.g., in terms of recall). Nevertheless, this choice is typically ad hoc and based on heuristics or plain intuition. In this paper, we present fundamental building blocks to make the choice of execution plans for t...

Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay Jain, Luis Gravano

Appropriate Execution Plans | Database | Execution Plans | Query-based Execution Plans | SIGMOD 2006

Post Info

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2006
Where	SIGMOD
Authors	Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay Jain, Luis Gravano
