Idle desktops have been successfully used to run sequential and master-slave task parallel codes on a large scale in the context of volunteer computing. However, execution of message passing parallel programs in such environments is challenging because a pool of nodes to execute an application may have architectural and operating system heterogeneity, can include widely distributed nodes across security domains, and nodes may become unavailable for computation frequently and without warning. The VolPEx (Parallel Execution on Volatile Nodes) tool set is building MPI support in such environments based on selective use of process redundancy and message logging. However, addressing this challenge requires tradeoffs between performance, portability, and usability. The paper introduces a robust MPI library that is designed to be highly portable across heterogeneous architectures and operating systems. This VolpexPyMPI1 library is built with Python, works with Linux and Windows platforms and ...
Troy P. LeBlanc, Jaspal Subhlok, Edgar Gabriel