ICDE
2007
IEEE

Query-Aware Sampling for Data Streams

Data Stream Management Systems are useful when large volumes of data need to be processed in real time. Examples include monitoring network traffic, monitoring financial transactions, and analyzing large-scale scientific data feeds. These applications have varying data rates and often show bursts of high activity that overload the system, frequently at the instants most critical for analysis (e.g., network attacks, financial spikes). Therefore, load shedding is necessary to preserve the stability of the system, gracefully degrade its performance, and extract answers. Existing methods for load shedding in a general-purpose data stream query system use random sampling of tuples, essentially independent of the query. While this technique is acceptable for some queries, the results may be meaningless or even incorrect for other queries. In principle, a number of different query-dependent sampling methods exist, but they work only for particular queries. In this paper, we show how to perform...
Added 03 Jun 2010
Updated 03 Jun 2010
Type Conference
Year 2007
Where ICDE
Authors Theodore Johnson, S. Muthukrishnan, Vladislav Shkapenyuk, Oliver Spatscheck