Given a dataset P, a k-means query returns k points in space (called centers), such that the average squared distance between each point in P and its nearest center is minimized. Since this problem is NP-hard, several approximate algorithms have been proposed and used in practice. In this paper, we study continuous k-means computation at a server that monitors a set of moving objects. Re-evaluating k-means every time there is an object update imposes a heavy burden on the server (for computing the centers from scratch) and the clients (for continuously sending location updates). We overcome these problems with a novel approach that significantly reduces the computation and communication costs, while guaranteeing that the quality of the solution, with respect to the re-evaluation approach, is bounded by a user-defined tolerance. The proposed method assigns each moving object a threshold (i.e., range) such that the object sends a location update only when it crosses the range boundary. F...
Zhenjie Zhang, Yin Yang, Anthony K. H. Tung, Dimit