Sciweavers

» Implementation of Fault-Tolerant GridRPC Applications
DSN
2003
IEEE
15 years 11 months ago
Engineering Fault-Tolerant TCP/IP Servers Using FT-TCP
In a recent paper [2] we have proposed FT-TCP: an architecture that allows a replicated service to survive crashes without breaking its TCP connections. FT-TCP is attractive in pr...
Dmitrii Zagorodnov, Keith Marzullo, Lorenzo Alvisi...
HPCA
2003
IEEE
16 years 6 months ago
Dynamic Data Replication: An Approach to Providing Fault-Tolerant Shared Memory Clusters
A challenging issue in today's server systems is to transparently deal with failures and application-imposed requirements for continuous operation. In this paper we address t...
Rosalia Christodoulopoulou, Reza Azimi, Angelos Bi...
DEXAW
2004
IEEE
15 years 10 months ago
Using Data-Flow Analysis for Resilience and Result Checking in Peer-To-Peer Computations
To achieve correct execution of peer-to-peer applications on non-reliable resources, we present a portable and distributed algorithm that provides fault tolerance and result checki...
Samir Jafar, Sébastien Varrette, Jean-Louis...
FGCS
2008
15 years 6 months ago
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
Darius Buntinas, Camille Coti, Thomas Hérau...
IPPS
2007
IEEE
16 years 15 days ago
A Fault Tolerance Protocol with Fast Fault Recovery
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...
Sayantan Chakravorty, Laxmikant V. Kalé