Sciweavers

IPPS
2007
IEEE
14 years 1 month ago
A Fault Tolerance Protocol with Fast Fault Recovery
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, rollback all ...
Sayantan Chakravorty, Laxmikant V. Kalé
HIPC
2000
Springer
13 years 11 months ago
Applying Patterns to Improve the Performance of Fault Tolerant CORBA
An increasing number of mission-critical, embedded, telecommunications, and financial distributed systems are being developed using distributed object computing middleware, such a...
Balachandran Natarajan, Aniruddha S. Gokhale, Shal...
CCGRID
2008
IEEE
13 years 7 months ago
Fault Tolerance and Recovery of Scientific Workflows on Computational Grids
In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...
Gopi Kandaswamy, Anirban Mandal, Daniel A. Reed
EDOC
2005
IEEE
14 years 1 month ago
FTWeb: A Fault Tolerant Infrastructure for Web Services
The web services architecture came as answers to the search for interoperability among applications. In recent years there has been a growing interest in deploying on the Internet...
Giuliana Teixeira Santos, Lau Cheuk Lung, Carlos M...
CLUSTER
2004
IEEE
13 years 11 months ago
FTC-Charm++: an in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI
As high performance clusters continue to grow in size, the mean time between failure shrinks. Thus, the issues of fault tolerance and reliability are becoming one of the challengi...
Gengbin Zheng, Lixia Shi, Laxmikant V. Kalé