Sciweavers

» Fault-Tolerant Distributed Simulation
GPC
2007
Springer
14 years 3 months ago
Fault Management in P2P-MPI
This paper presents recent developments in fault management for P2P-MPI, a grid middleware, covering fault tolerance for applications and fault detect...
Stéphane Genaud, Choopan Rattanapoka
IPPS
2006
IEEE
14 years 3 months ago
Coordinated checkpoint from message payload in pessimistic sender-based message logging
Execution of MPI applications on cluster and grid deployments suffers from node and network failures, which motivates the use of fault-tolerant MPI implementations. Two category tec...
M. Aminian, Mohammad K. Akbari, Bahman Javadi
ASWSD
2004
Springer
14 years 2 months ago
On the Fault Hypothesis for a Safety-Critical Real-Time System
A safety-critical real-time computer system must provide its services with a dependability that is much better than the dependability of any one of its constituent components. ...
Hermann Kopetz
HPDC
2000
IEEE
14 years 1 months ago
Robust Resource Management for Metacomputers
In this paper we present a robust software infrastructure for metacomputing. The system is intended to be used by others as a building block for large and powerful computational g...
Jörn Gehring, Achim Streit
CCGRID
2006
IEEE
14 years 22 days ago
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing
The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...
Chokchai Leangsuksun, Tirumala Rao, Anand Tikoteka...