Linear optimization queries retrieve the top-K tuples in a sliding window of a data stream that maximize/minimize the linearly weighted sums of certain attribute values. To efficiently answer such queries against a large relation, an onion index was previously proposed to properly organize all the tuples in the relation. However, such an onion index does not work in a streaming environment due to fast tuple arrival rate and limited memory. In this paper, we propose a SAO index to approximately answer arbitrary linear optimization queries against a data stream. It uses a small amount of memory to efficiently keep track of the most "important" tuples in a sliding window of a data stream. The index maintenance cost is small because the great majority of the incoming tuples do not cause any changes to the index and are quickly discarded. At any time, for any linear optimization query, we can retrieve from the SAO index the approximate top-K tuples in the sliding window almost in...
Gang Luo, Kun-Lung Wu, Philip S. Yu