The Internet today has transformed into a global information hub. The increase in its usage and magnitude has sparked various research problems. Because of the diverse user populat...
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet...
Histograms are typically used to approximate data distributions. Histograms and related synopsis structures have been successful in a wide variety of popular database applications...
Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a cluster...
In many applications, one has to actively select among a set of expensive observations before making an informed decision. Often, we want to select observations which perform well...
Andreas Krause, H. Brendan McMahan, Carlos Guestri...