Search Sciweavers | Sciweavers

CLUSTER 2004, IEEE

103 views | Distributed And Parallel Com...

MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware

13 years 7 months ago

Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-pas...

Rajanikanth Batchu, Yoginder S. Dandass, Anthony S...

ISPA 2007, Springer

144 views | Distributed And Parallel Com...

Binomial Graph: A Scalable and Fault-Tolerant Logical Network Topology

14 years 1 month ago

Download icl.cs.utk.edu

The number of processors embedded in high performance computing platforms is growing daily to solve larger and more complex problems. The logical network topologies must also suppo...

Thara Angskun, George Bosilca, Jack Dongarra

IPPS 2007, IEEE

129 views | Distributed And Parallel Com...

A Fault Tolerance Protocol with Fast Fault Recovery

14 years 2 months ago

Download www.cecs.uci.edu

Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...

Sayantan Chakravorty, Laxmikant V. Kalé

ICPP 2000, IEEE

139 views | Distributed And Parallel Com...

A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems

13 years 11 months ago

Download www.globus.org

The idle computers on a local area, campus area, or even wide area network represent a significant computational resource--one that is, however, also unreliable, heterogeneous, an...

Adriana Iamnitchi, Ian T. Foster

IPPS 2005, IEEE

154 views | Distributed And Parallel Com...

Fault-Tolerant Parallel Applications with Dynamic Parallel Schedules

14 years 1 month ago

Download dps.epfl.ch

Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...

Sebastian Gerlach, Roger D. Hersch
