Adaptive parallel computations--computations that can adapt to changes in resource availability and requirement--can effectively use networked machines because they dynamically expand as machines become available and dynamically acquire machines as needed. While most parallel programming systems provide the means to develop adaptive programs, they do not provide any functional interface to external resource management systems. Thus, no existing resource management system has the capability to manage resources on commodity system software, arbitrating the demands of multiple adaptive computations written using diverse programming environments. This paper presents a set of novel mechanisms that facilitate dynamic allocation of resources to adaptive parallel computations. The mechanisms are built on low-level features common to many programming systems, and unique in their ability to transparently manage multiple adaptive parallel programs that were not developed to have their resources ...
Arash Baratloo, Ayal Itzkovitz, Zvi M. Kedem, Yuan