Overload management has been an important problem for large-scale dynamic systems. In this paper, we study this problem in the context of our Borealis distributed stream processing system. We show that server nodes must coordinate their load shedding decisions to achieve global control over output quality. We describe a distributed load shedding approach that provides this coordination through upstream metadata aggregation and propagation. This metadata enables an upstream node to make fast local load shedding decisions that influence its descendant nodes in the best possible way.
Nesime Tatbul, Stanley B. Zdonik