Sciweavers

CLUSTER
2006
IEEE

FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?

The impact of faults is a topic of paramount importance in the development of Cluster and Grid middleware, since faults are common in Grid infrastructures and large-scale distributed systems. MPI (Message Passing Interface) is a popular abstraction for programming distributed and parallel applications. FAIL (FAult Injection Language) is an abstract language for describing fault occurrences, capable of expressing complex and realistic fault scenarios. In this paper, we investigate the possibility of using FAIL to inject faults into a fault-tolerant MPI implementation. Our middleware, FAIL-MPI, is used to carry out quantitative and qualitative fault and stress testing.
Added 10 Jun 2010
Updated 10 Jun 2010
Type Conference
Year 2006
Where CLUSTER
Authors William Hoarau, Pierre Lemarinier, Thomas Hérault, Eric Rodriguez, Sébastien Tixeuil, Franck Cappello