Malleability enables a parallel application’s execution system to split or merge processes modifying granularity. While process migration is widely used to adapt applications to dynamic execution environments, it is limited by the granularity of the application’s processes. Malleability empowers process migration by allowing the application’s processes to expand or shrink following the availability of resources. We have implemented malleability as an extension to the PCM (Process Checkpointing and Migration) library, a user-level library for iterative MPI applications. PCM is integrated with the Internet Operating System (IOS), a framework for middleware-driven dynamic application reconfiguration. Our approach requires minimal code modifications and enables transparent middlewaretriggered reconfiguration. Experimental results using a two-dimensional data parallel program that has a regular communication structure demonstrate the usefulness of malleability.
Kaoutar El Maghraoui, Travis J. Desell, Boleslaw K