Recent advances in computing and communication have given rise to the notion of the computational grid. The core of this computing paradigm is the design of a system that draws compute power from a confederation of geographically dispersed, heterogeneous resources, seamlessly and ubiquitously. If high levels of performance are to be achieved, data locality must be identified and managed. In this paper, we consider the effect of server-side staging on the behavior of a class of wide-area "task farming" applications. We show that staging improves task throughput mainly through increased parallelism rather than through a reduction in the overall turnaround time per task. We derive a model for farming applications with and without server-side staging and verify the model through live experiments as well as simulations.
Wael R. Elwasif, James S. Plank, Richard Wolski