Estimation via sampling out of highly selective join queries is well known to be problematic, most notably in online aggregation. Without goal-directed sampling strategies, samples...
We consider the distributed computation of a function of random sources with minimal communication. Specifically, given two discrete memoryless sources, X and Y, a receiver wishe...
We present a simple and practical algorithm for the c-approximate near neighbor problem (c-NN): given n points P ⊂ R^d and radius R, build a data structure which, given q ∈ R^d, can ...
We consider the problem of approximating a set P of n points in R^d by a j-dimensional subspace under the ℓ_p measure, in which we wish to minimize the sum of ℓ_p distances from each p...
Dan Feldman, Morteza Monemizadeh, Christian Sohler...
Abstract. Three simple and explicit procedures for testing the independence of two multi-dimensional random variables are described. Two of the associated test statistics (L1, log-...