Abstract— Improving data quality is a time-consuming, labor-intensive and often domain-specific operation. A recent principled approach for repairing a dirty database is to use dat...
Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Nevi...
To cope with bursty arrivals of high-volume data, a DSMS has to shed load while minimizing the degradation of Quality of Service (QoS). In this paper, we show that this problem...
Abstract— Large graph datasets are ubiquitous in many domains, including social networking and biology. Graph summarization techniques are crucial in such domains as they can ass...
Extract-Transform-Load (ETL) processes play an important role in data warehousing. Typically, design work on ETL has focused on performance as the sole metric to make sure that the...
Alkis Simitsis, Kevin Wilkinson, Umeshwar Dayal, M...
Estimation via sampling out of highly selective join queries is well known to be problematic, most notably in online aggregation. Without goal-directed sampling strategies, samples...