Control decisions of intelligent devices in critical infrastructure can have a significant impact on human life and the environment. Ensuring that the appropriate data is availabl...
Fault tolerance in MPI has become a major issue in the HPC community. Several approaches are envisioned, from user- or programmer-controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
In this paper, we present a comprehensive study of the threats to the coordination services for Web services business activities and explore the optimal solution to miti...
This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed real-time systems. The objective of NLFT is to mask errors at the node level in orde...
In this paper, we study a fault-tolerant model for Grid environments based on the task replication concept. The basic idea is to produce and submit to the Grid multiple replicas of ...