Data stream applications have made use of statistical summaries to reason about the data using nonparametric tools such as histograms, heavy hitters, and join sizes. However, rela...
Database columns are often correlated, so that cardinality estimates computed by assuming independence often lead to a poor choice of query plan by the optimizer. Multidimensional...
Utkarsh Srivastava, Peter J. Haas, Volker Markl, M...
We propose a fast algorithm, EMD-L1, for computing the Earth Mover's Distance (EMD) between a pair of histograms. Compared to the original formulation, EMD-L1 has a largely si...
The earth mover's distance (EMD) [16] is an important perceptually meaningful metric for comparing histograms, but it suffers from high (O(N^3 log N)) computational complexity...
We present algorithms for fast quantile and frequency estimation in large data streams using graphics processor units (GPUs). We exploit the high computational power and memory ba...
Naga K. Govindaraju, Nikunj Raghuvanshi, Dinesh Ma...