Abstract— Developing a full-fledged cost-based XQuery optimizer is a fairly complex task. Nowadays, there is little knowledge concerning suitable cost formulae and optimization ...
— Information extraction systems are traditionally implemented as a pipeline of special-purpose processing modules. A major drawback of such an approach is that whenever a new ex...
— One of the most prominent data quality problems is the existence of duplicate records. Current data cleaning systems usually produce one clean instance (repair) of the input da...
George Beskales, Mohamed A. Soliman, Ihab F. Ilyas...
Abstract— Large graph datasets are ubiquitous in many domains, including social networking and biology. Graph summarization techniques are crucial in such domains as they can ass...
— Commercial tuple extraction systems have had some success in extracting tuples by regarding HTML pages as tree structures and exploiting XPath queries to find attributes of t...
— The increasing number of infectious disease outbreaks has led to a need for new research to better understand the disease’s origins, epidemiological features and pathogenicity caused b...
— Finding the k nearest neighbors (kNN) of a query point or of a set of query points (kNN-Join) are fundamental problems in many application domains. Many previous efforts to solve...
— To cope with bursty arrivals of high-volume data, a DSMS has to shed load while minimizing the degradation of Quality of Service (QoS). In this paper, we show that this problem...
Abstract— Checking if an SQL query has been written correctly is not an easy task. Formal verification is not applicable, since it is based on comparing a specification with an...
— Many modern optimizers use a transformation-rule-based framework. While there has been a lot of work on identifying new transformation rules, there has been little work focused ...
Surajit Chaudhuri, Leo Giakoumakis, Vivek R. Naras...