Approximating the joint data distribution of a multi-dimensional data set through a compact and accurate histogram synopsis is a fundamental problem arising in numerous practical ...
Amol Deshpande, Minos N. Garofalakis, Rajeev Rasto...
This paper proposes and evaluates Prefetching B+ -Trees pB+ -Trees, which use prefetching to accelerate two important operations on B+ -Tree indices: searches and range scans. To ...
Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techn...
The ability to approximately answer aggregation queries accurately and efficiently is of great benefit for decision support and data mining tools. In contrast to previous sampling...
In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or ra...
Markus M. Breunig, Hans-Peter Kriegel, Peer Kr&oum...
In this paper we present a method for automatically segmenting unformatted text records into structured elements. Several useful data sources today are human-generated as continuo...
Vinayak R. Borkar, Kaustubh Deshmukh, Sunita Saraw...