Sciweavers

PVM
2010
Springer
13 years 5 months ago
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols
Abstract. With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas Hé...
EUMAS
2006
13 years 9 months ago
DimaX: A Fault-Tolerant Multi-Agent Platform
Fault tolerance is an important property of large-scale multiagent systems as the failure rate grows with both the number of the hosts and deployed agents, and the duration of com...
Nora Faci, Zahia Guessoum, Olivier Marin
ICDCS
2007
IEEE
14 years 1 month ago
Protocol Design and Optimization for Delay/Fault-Tolerant Mobile Sensor Networks
While extensive studies have been carried out in the past several years for many sensor applications, they cannot be applied to the network with extremely low and intermittent con...
Yu Wang, Hongyi Wu, Feng Lin, Nian-Feng Tzeng
IPPS
2006
IEEE
14 years 1 month ago
Coordinated checkpoint from message payload in pessimistic sender-based message logging
Execution of MPI applications on Clusters and Grid deployments suffers from node and network failures, which motivates the use of fault-tolerant MPI implementations. Two category tec...
M. Aminian, Mohammad K. Akbari, Bahman Javadi
HPDC
2008
IEEE
14 years 1 month ago
DataLab: transactional data-parallel computing on an active storage cloud
Active storage clouds are an attractive platform for executing large data-intensive workloads found in many fields of science. However, active storage presents new system managem...
Brandon Rich, Douglas Thain