We study the problem of computing wavelet-based synopses for massive data sets in static and streaming environments. A compact representation of a data set is obtained after a thre...
We define a natural notion of efficiency for approximate nearest-neighbor (ANN) search in general n-point metric spaces, namely the existence of a randomized algorithm which answ...
We study algorithms for clustering data that were recently proposed by Balcan, Blum and Gupta in SODA’09 [4] and that have already given rise to two follow-up papers. The input f...
We consider k-median clustering in finite metric spaces and k-means clustering in Euclidean spaces, in the setting where k is part of the input (not a constant). For the k-means pr...
We present a clustering scheme that combines a mode-seeking phase with a cluster merging phase in the corresponding density map. While mode detection is done by a standard graph-b...