We consider the problem of query optimization in distributed stream-based systems in which multiple continuous queries may execute simultaneously. In such systems, distribution adds degrees of freedom to an already complex optimization problem. Thousands of network nodes may need to be considered as candidates for operator placement in order to support in-network processing, a scale that is clearly overwhelming even from the perspective of distributed query optimization. Adding to this complexity is the potential for significant savings from combining query plans so that streams of intermediate results can be reused. These issues force us to develop new techniques for query optimization. We present a formal definition of the multi-query optimization problem in such systems and propose some initial directions.
Sangeetha Seshadri, Vibhore Kumar, Brian F. Cooper