Virtually all histograms store for each bucket the number of distinct values it contains and their average frequency. In this paper, we question this paradigm. We start out by inv...
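As a rough illustration of the conventional bucket layout this abstract refers to, the following minimal Python sketch stores a distinct-value count and an average frequency per bucket and derives cardinality estimates from them. The names `Bucket`, `build_bucket`, `estimate_equality`, and `estimate_range` are hypothetical, and the integer domain and uniform-spread assumption are simplifications for illustration, not the paper's method.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    lo: int          # inclusive lower bound of the bucket's value range
    hi: int          # inclusive upper bound of the bucket's value range
    distinct: int    # number of distinct values seen in [lo, hi]
    avg_freq: float  # average frequency of those distinct values


def build_bucket(values, lo, hi):
    """Summarize the values falling in [lo, hi] (hypothetical helper)."""
    counts = {}
    for v in values:
        if lo <= v <= hi:
            counts[v] = counts.get(v, 0) + 1
    distinct = len(counts)
    avg_freq = sum(counts.values()) / distinct if distinct else 0.0
    return Bucket(lo, hi, distinct, avg_freq)


def estimate_equality(bucket, value):
    """Estimate the row count for `col = value`: every value in the bucket
    is assumed to occur with the average frequency."""
    return bucket.avg_freq if bucket.lo <= value <= bucket.hi else 0.0


def estimate_range(bucket, a, b):
    """Estimate the row count for `a <= col <= b`, assuming rows are spread
    uniformly over the bucket's integer range (an assumption of this sketch)."""
    overlap = min(b, bucket.hi) - max(a, bucket.lo) + 1
    if overlap <= 0:
        return 0.0
    width = bucket.hi - bucket.lo + 1
    total_rows = bucket.distinct * bucket.avg_freq
    return total_rows * overlap / width


data = [1, 1, 1, 2, 5, 5, 9]
bucket = build_bucket(data, 1, 10)
print(bucket)
print(estimate_equality(bucket, 5))
print(estimate_range(bucket, 1, 5))
```

On skewed data such as the sample above, both estimates can deviate noticeably from the true counts, which is the kind of error a per-bucket summary of this form cannot capture.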
Commercial databases compete for market share, which is composed of not only net-new sales to those purchasing a database for the first time, but also competitive “win-backs”...
Reynold Xin, Patrick Dantressangle, Sam Lightstone...
To evaluate the performance of database applications and DBMSs, we usually execute workloads of queries on generated databases of different sizes and measure the response time. Th...
This demonstration presents QueRIE, a recommender system that supports interactive database exploration. This system aims at assisting non-expert users of scientific databases by...
We propose the k-representative regret minimization query (k-regret) as an operation to support multi-criteria decision making. Like top-k, the k-regret query assumes that users h...
Danupon Nanongkai, Atish Das Sarma, Ashwin Lall, R...
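The regret notion behind a k-regret query can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: utilities are assumed to be linear with non-negative weights, the worst case is approximated by random sampling of weight vectors, and the names `utility`, `regret_ratio`, and `max_regret_ratio` are hypothetical.

```python
import random


def utility(weights, point):
    """Linear utility of a tuple under a weight vector (assumed utility class)."""
    return sum(w * x for w, x in zip(weights, point))


def regret_ratio(database, subset, weights):
    """Relative utility lost by a user with `weights` who sees only `subset`
    instead of the whole `database`."""
    best_all = max(utility(weights, p) for p in database)
    best_sub = max(utility(weights, p) for p in subset)
    if best_all <= 0:
        return 0.0
    return (best_all - best_sub) / best_all


def max_regret_ratio(database, subset, n_samples=1000, seed=0):
    """Approximate the worst-case regret ratio by sampling weight vectors.
    Sampling yields a lower bound on the true maximum, not the exact value."""
    rng = random.Random(seed)
    dim = len(database[0])
    worst = 0.0
    for _ in range(n_samples):
        weights = [rng.random() for _ in range(dim)]
        worst = max(worst, regret_ratio(database, subset, weights))
    return worst


database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
subset = [(0.6, 0.6)]
print(regret_ratio(database, subset, (1.0, 0.0)))
print(max_regret_ratio(database, subset))
```

A k-regret query would then look for a subset of k tuples that keeps this worst-case ratio small; the search for such a subset is not shown here.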
Data ambiguity is inherent in applications such as data integration, location-based services, and sensor monitoring. In many situations, it is possible to “clean”, or remove, ...
Reynold Cheng, Eric Lo, Xuan Yang, Ming-Hay Luk, X...
We propose the demonstration of Keymantic, a system for keyword-based searching in relational databases that does not require a-priori knowledge of instances held in a database. I...
Sonia Bergamaschi, Elton Domnori, Francesco Guerra...
Recently Sarathy and Muralidhar (2009) provided the first attempt at illustrating the implementation of differential privacy for numerical data. In this paper, we attempt to provid...
Minimum-distance controlled tabular adjustment (CTA) methods, and their restricted variants (RCTA), are a recent perturbative approach for tabular data protection. Given a table to be...
Statistical agencies release microdata to researchers after applying statistical disclosure control (SDC) methods. Noise addition is a perturbative SDC method which is carried out...