SIGMOD
2005
ACM

On Joining and Caching Stochastic Streams

We consider the problem of joining data streams using limited cache memory, with the goal of producing as many result tuples as possible from the cache. Many cache replacement heuristics have been proposed in the past. Their performance often relies on implicit assumptions about the input streams, e.g., that the join attribute values follow a relatively stationary distribution. However, in general and in practice, streams often exhibit more complex behaviors, such as increasing trends and random walks, rendering these "hardwired" heuristics inadequate. In this paper, we propose a framework that is able to exploit known or observed statistical properties of input streams to make cache replacement decisions aimed at maximizing the expected number of result tuples. To illustrate the complexity of the solution space, we show that even an algorithm that considers, at every time step, all possible sequences of future replacement decisions may not be optimal. We then identify a con...
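The problem setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the framework proposed in the paper: it assumes two equal-length streams R and S, a single shared cache, and known stationary distributions of join attribute values. The names `simulate_join` and `expected_match_score` are hypothetical. The policy evicts the cached tuple that is least likely to match a future arrival from the opposite stream.

```python
def simulate_join(stream_r, stream_s, cache_size, score):
    """Count join results produced from a bounded cache shared by two streams.

    stream_r, stream_s: equal-length lists of join attribute values.
    score(side, value): hypothetical usefulness estimate for a cached tuple;
    the lowest-scoring tuple is evicted first (ties evict the oldest entry).
    """
    cache = []  # entries are (side, value), oldest first
    results = 0
    for r, s in zip(stream_r, stream_s):
        # Each arriving tuple joins only with cached tuples of the other stream.
        results += sum(1 for side, v in cache if side == "S" and v == r)
        results += sum(1 for side, v in cache if side == "R" and v == s)
        # Tentatively admit both arrivals, then evict until the cache fits.
        cache.append(("R", r))
        cache.append(("S", s))
        while len(cache) > cache_size:
            victim = min(range(len(cache)), key=lambda i: score(*cache[i]))
            cache.pop(victim)
    return results


def expected_match_score(p_r, p_s):
    """Build a score function from assumed stationary value distributions.

    p_r, p_s: dicts mapping a join value to its per-step arrival probability
    in stream R and stream S. A cached R tuple is scored by how likely S is
    to produce its value, and vice versa.
    """
    def score(side, value):
        opposite = p_s if side == "R" else p_r
        return opposite.get(value, 0.0)
    return score


# Toy streams: S keeps producing value 1, which R produced once at the start.
stream_r = [1, 2, 2, 2]
stream_s = [7, 1, 1, 1]
p_r = {1: 0.25, 2: 0.75}
p_s = {7: 0.25, 1: 0.75}

informed = simulate_join(stream_r, stream_s, 1, expected_match_score(p_r, p_s))
oldest_first = simulate_join(stream_r, stream_s, 1, lambda side, value: 0.0)
print(informed, oldest_first)
```

With a cache of one tuple, the statistics-aware score keeps the R tuple with value 1 and so matches every later arrival of 1 in S, while the constant score (which degenerates to evicting the oldest entry) produces no results on this input. A stationary score like this is exactly the kind of assumption the abstract warns about: it would not adapt to trends or random walks in the join attribute values.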
Jun Yang 0001, Junyi Xie, Yuguo Chen
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2005
Where SIGMOD