Main memory is a critical resource when processing non-blocking queries with state intensive operators that require real-time responses. While partitioned parallel processing can alleviate the stringent memory demands in some cases, in general even in a distributed system main memory remains bounded. In this work, we thus investigate the integration of two run-time adaptation techniques, namely, state spill to disk and state relocation to an alternate machine, to handle this memory shortage problem. We analyze the tradeoffs regarding key factors affecting these two runtime operator state adaptation techniques in a modern computecluster environment. Two strategies, lazy-disk and active-disk, are then proposed that integrate both state spill and state relocation adaptations with different emphasis on local versus global decision making. Extensive experiments of the proposed query processing system conducted on a compute-cluster (not merely a simulation) confirm the effectiveness of the...
Bin Liu, Mariana Jbantova, Elke A. Rundensteiner