
IPPS
2005
IEEE

Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries

We observe increasing interest in aggregating geographically distributed, heterogeneous resources to perform large-scale computations. MPI remains the most popular programming paradigm for such applications; however, as the size of computing environments increases, fault tolerance becomes critically important. We argue that the fault tolerance model proposed by FT-MPI fits well in geographically distributed environments, even though its current implementation is confined to a single administrative domain. We propose to overcome this limitation by combining FT-MPI with the H2O resource sharing framework. Our approach allows users to run fault-tolerant MPI programs on heterogeneous, geographically distributed, shared machines without sacrificing performance and with minimal involvement of resource providers.
Dawid Kurzyniec, Vaidy S. Sunderam
Type Conference
Year 2005
Where IPPS
Authors Dawid Kurzyniec, Vaidy S. Sunderam