To deal with data uncertainty, existing probabilistic database systems augment tuples with attribute-level or tuple-level probability values, which are loaded into the database al...
Ravi Jampani, Fei Xu, Mingxi Wu, Luis Leopoldo Per...
— One of the most prominent data quality problems is the existence of duplicate records. Current data cleaning systems usually produce one clean instance (repair) of the input da...
George Beskales, Mohamed A. Soliman, Ihab F. Ilyas...
Abstract-- When dealing with massive quantities of data, topk queries are a powerful technique for returning only the k most relevant tuples for inspection, based on a scoring func...
Most operations of the relational algebra or SQL require a sorted stream of tuples for efficient processing. Therefore, processing complex relational queries relies on efficient a...
A key advantage of scientific workflow systems over traditional scripting approaches is their ability to automatically record data and process dependencies introduced during workf...