—We demonstrate DPCube, a component in our Health Information DE-identification (HIDE) framework, for releasing differentially private data cubes (or multi-dimensional histogram...
Disk and network latency must be taken into account when applying parallel computing to large multidimensional datasets because they can hinder performance by reducing the rate at...
Abstract. We present a cost-based adaptive clustering method to improve average performance of spatial queries (intersection, containment, enclosure queries) over large collections...
Data-warehousing applications cope with enormous data sets in the range of Gigabytes and Terabytes. Queries usually either select a very small set of this data or perform aggregat...
Partial replication is one type of optimization to speed up execution of queries submitted to large datasets. In partial replication, a portion of the dataset is extracted, re-org...