Optimal load shedding with aggregates and mining queries

15 years 9 months ago

Download www.cs.ucla.edu

To cope with bursty arrivals of high-volume data, a DSMS has to shed load while minimizing the degradation of Quality of Service (QoS). In this paper, we show that this problem can be formalized as a classical optimization task from operations research, in ways that accommodate different requirements for multiple users, different query sensitivities to load shedding, and different penalty functions. Standard nonlinear programming algorithms are adequate for non-critical situations, but for severe overloads, we propose a more efficient algorithm that runs in linear time, without compromising optimality. Our approach is applicable to a large class of queries including traditional SQL aggregates, statistical aggregates (e.g., quantiles), and data mining functions, such as k-means, naive Bayesian classifiers, decision trees, and frequent pattern discovery (where we can even specify a different error bound for each pattern). In fact, we show that these aggregate queries are special in...

Barzan Mozafari, Carlo Zaniolo
