Sharing aggregate computation for distributed queries

An emerging challenge in modern distributed querying is to efficiently process multiple continuous aggregation queries simultaneously. Processing each query independently may be infeasible, so multi-query optimizations are critical for sharing work across queries. The challenge is to identify overlapping computations that may not be obvious in the queries themselves. In this paper, we reveal new opportunities for sharing work in the context of distributed aggregation queries that vary in their selection predicates. We identify settings in which a large set of q such queries can be answered by executing k ≪ q different queries. The k queries are revealed by analyzing a Boolean matrix capturing the connection between data and the queries that they satisfy, in a manner akin to familiar techniques like Gaussian elimination. Indeed, we identify a class of linear aggregate functions (including SUM, COUNT and AVERAGE), and show that the sharing potential for such queries can be optimally reco...
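The matrix idea in the abstract can be pictured with a small sketch. This is an illustration only and not the paper's algorithm: it assumes SUM queries, a hand-made Boolean matrix `A` whose rows are queries and whose columns are groups of data items that satisfy the same set of queries, and made-up per-group sums. The function name `share_plan` and all variable names are hypothetical. Plain Gaussian elimination over the rationals picks k independent rows, and every remaining query is then rebuilt as a linear combination of those k answers.

```python
from fractions import Fraction


def share_plan(matrix):
    """Pick independent rows of `matrix` and express every row through them.

    Returns (basis, coeffs). `basis` lists the indices of the rows chosen as
    independent. coeffs[i] maps a basis index b to a weight w such that
    row i equals the sum of w * matrix[b] over all entries of coeffs[i].
    """
    # Each entry: (pivot column, reduced vector, that vector as a combo of original rows)
    reduced = []
    basis, coeffs = [], []
    for i, row in enumerate(matrix):
        v = [Fraction(x) for x in row]
        combo = {}  # part of row i already explained by earlier basis rows
        for pivot, rvec, rcombo in reduced:
            if v[pivot] != 0:
                factor = v[pivot] / rvec[pivot]
                v = [a - factor * b for a, b in zip(v, rvec)]
                for b, w in rcombo.items():
                    combo[b] = combo.get(b, Fraction(0)) + factor * w
        if any(v):
            # A nonzero residual means row i is independent: it joins the basis.
            pivot = next(j for j, x in enumerate(v) if x != 0)
            rcombo = {b: -w for b, w in combo.items()}
            rcombo[i] = Fraction(1)
            reduced.append((pivot, v, rcombo))
            basis.append(i)
            coeffs.append({i: Fraction(1)})
        else:
            # Zero residual: row i is a linear combination of basis rows.
            coeffs.append({b: w for b, w in combo.items() if w != 0})
    return basis, coeffs


# Hypothetical example: q = 5 queries over 4 groups of data items.
# A[i][j] == 1 means every item in group j satisfies query i's predicate.
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
fragment_sums = [10, 20, 30, 40]  # assumed SUM of the aggregated value per group

basis, coeffs = share_plan(A)

# Only the k basis queries are actually executed.
executed = {b: sum(a * s for a, s in zip(A[b], fragment_sums)) for b in basis}

# Every query answer, executed or not, is rebuilt from the k executed answers.
answers = [sum(w * executed[b] for b, w in c.items()) for c in coeffs]

# Reference: answer each query independently.
direct = [sum(a * s for a, s in zip(row, fragment_sums)) for row in A]
```

In this toy case three executed queries (rows 0, 1 and 3) suffice to answer all five, because row 2 is the sum of rows 0 and 1, and row 4 is row 0 plus row 1 minus row 3. The subtraction is what restricts this kind of reconstruction to linear aggregates such as SUM and COUNT.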

Ryan Huebsch, Minos N. Garofalakis, Joseph M. Hellerstein, Ion Stoica

Continuous Aggregation Queries | Database | Optimal Sharing Maps | SIGMOD 2007 | Typical Aggregation Functions

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2007
Where	SIGMOD
Authors	Ryan Huebsch, Minos N. Garofalakis, Joseph M. Hellerstein, Ion Stoica
