Sciweavers

PODS
2003
ACM

Maintaining variance and k-medians over data stream windows

14 years 11 months ago
Maintaining variance and k-medians over data stream windows
The sliding window model is useful for discounting stale data in data stream applications. In this model, data elements arrive continually and only the most recent N elements are used when answering queries. We present a novel technique for solving two important and related problems in the sliding window model -- maintaining variance and maintaining a k? median clustering. Our solution to the problem of maintaining variance provides a continually updated estimate of the variance of the last N values in a data stream with relative error of at most using O( 1 2 log N) memory. We present a constant-factor approximation algorithm which maintains an approximate k?median solution for the last N data points using O( k 4 N2 log2 N) memory, where < 1/2 is a parameter which trades off the space bound with the approximation factor of O(2O(1/) ).
Brian Babcock, Mayur Datar, Rajeev Motwani, Liadan
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2003
Where PODS
Authors Brian Babcock, Mayur Datar, Rajeev Motwani, Liadan O'Callaghan
Comments (0)