Query optimizers in current database systems are designed to pick a single efficient plan for a given query based on current statistical properties of the data. However, different subsets of the data can sometimes have very different statistical properties. In such scenarios it can be more efficient to process different subsets of the data for a query using different plans. We propose a new query processing technique called content-based routing (CBR) that eliminates the single-plan restriction in current systems. We present low-overhead adaptive algorithms that partition input data based on statistical properties relevant to query execution strategies, and efficiently route individual tuples through customized plans based on their partition. We have implemented CBR as an extension to the Eddies query processor in the TelegraphCQ system, and we present an extensive experimental evaluation showing the significant performance benefits of CBR.
Pedro Bizarro, Shivnath Babu, David J. DeWitt, Jen