Sciweavers

ICDE
2010
IEEE

Global Iceberg Detection over Distributed Data Streams

In today's Internet applications and sensor networks, we often encounter large amounts of data spread over many physically distributed nodes. The sheer volume of the data and bandwidth constraints make it impractical to send all the data to one central node for query processing. Finding distributed icebergs--elements that may have low frequency at individual nodes but high aggregate frequency--is a problem that arises commonly in practice. In this paper we present a novel algorithm with two notable properties. First, its accuracy guarantee and communication cost are independent of the way in which element counts (for both icebergs and non-icebergs) are split amongst the nodes. Second, it works even when each distributed data set is a stream (i.e., one-pass data access only). Our algorithm builds upon sketches constructed for the estimation of the second frequency moment (F2) of data streams. The intuition behind our idea is that when there are global icebergs in the union of these data ...
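The abstract is truncated before the detection step, so the following is not the paper's algorithm. It is a minimal Python sketch of the building block the abstract names: a linear sketch for estimating the second frequency moment (F2) in the style of the classic AMS "tug-of-war" estimator, built in one pass at each node and merged at a coordinator. The class name `F2Sketch`, its parameters, and the use of a BLAKE2 hash as a stand-in for the 4-wise independent sign hash that the analysis of such sketches assumes are all choices made for this illustration.

```python
import hashlib
import statistics


def sign(seed: int, item: str) -> int:
    """Pseudo-random +1/-1 for a (seed, item) pair.

    Assumption: a cryptographic hash stands in for the 4-wise
    independent hash family used in the formal analysis.
    """
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return 1 if digest[0] & 1 else -1


class F2Sketch:
    """Illustrative tug-of-war sketch for F2 = sum of squared frequencies."""

    def __init__(self, groups: int = 5, per_group: int = 64):
        self.groups = groups
        self.per_group = per_group
        self.counters = [0] * (groups * per_group)

    def update(self, item: str, count: int = 1) -> None:
        # One-pass update: each counter adds +count or -count for the item.
        for i in range(len(self.counters)):
            self.counters[i] += count * sign(i, item)

    def merge(self, other: "F2Sketch") -> "F2Sketch":
        # Sketches are linear, so the sketch of the union of two streams
        # is the counter-wise sum of the two sketches.
        merged = F2Sketch(self.groups, self.per_group)
        merged.counters = [a + b for a, b in zip(self.counters, other.counters)]
        return merged

    def estimate_f2(self) -> float:
        # Median of group means of squared counters.
        means = []
        for g in range(self.groups):
            chunk = self.counters[g * self.per_group:(g + 1) * self.per_group]
            means.append(sum(c * c for c in chunk) / self.per_group)
        return statistics.median(means)


def sketch_stream(stream) -> F2Sketch:
    """Build a sketch from one node's local stream in a single pass."""
    sketch = F2Sketch()
    for item in stream:
        sketch.update(item)
    return sketch
```

Because merging is a counter-wise sum, each node only needs to ship its fixed-size counter array to the coordinator, and the merged result does not depend on how an element's occurrences were divided among the nodes. An element with a large aggregate count contributes the square of that count to F2, so it dominates the estimate even when its count at every individual node is small.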
Added 20 Dec 2009
Updated 03 Jan 2010
Type Conference
Year 2010
Where ICDE
Authors Ashwin Lall, Haiquan (Chuck) Zhao, Jun Xu, Mitsunori Ogihara