Sciweavers

IPPS
2007
IEEE
14 years 2 months ago
A Fault Tolerance Protocol with Fast Fault Recovery
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...
Sayantan Chakravorty, Laxmikant V. Kalé
CLUSTER
2005
IEEE
14 years 2 months ago
Job-Site Level Fault Tolerance for Cluster and Grid environments
Kshitij Limaye, Box Leangsuksun, Zeno Greenwood, S...
IPPS
2005
IEEE
14 years 2 months ago
Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries
We observe increasing interest in aggregating geographically distributed, heterogeneous resources to perform large-scale computations. MPI remains the most popular programming par...
Dawid Kurzyniec, Vaidy S. Sunderam
IPPS
1998
IEEE
14 years 25 days ago
Hyper Butterfly Network: A Scalable Optimally Fault Tolerant Architecture
Bounded degree networks like de Bruijn graphs or wrapped butterfly networks are very important from VLSI implementation point of view as well as for applications where the computing n...
Wei Shi, Pradip K. Srimani