Sciweavers

402 search results - page 10 / 81
PDCN
2007
13 years 8 months ago
A new robust centralized DMX algorithm
In a distributed system, process synchronization is an important agenda. One of the major duties for process synchronization is mutual exclusion. This paper presents a new central...
Moharram Challenger, Vahid Khalilpour, Peyman Baya...
SOSP
2007
ACM
14 years 4 months ago
Zyzzyva: speculative byzantine fault tolerance
We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault tolerant state machine replication. In Zyzzyva, replicas respond...
Ramakrishna Kotla, Lorenzo Alvisi, Michael Dahlin,...
IPPS
2005
IEEE
14 years 29 days ago
Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance
Checkpoint/restart is a general idea for which particular implementations enable various functionalities in computer systems, including process migration, gang scheduling, hiberna...
José Carlos Sancho, Fabrizio Petrini, Kei D...
PVM
2005
Springer
14 years 26 days ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
APAQS
2001
IEEE
13 years 11 months ago
Incremental Fault-Tolerant Design in an Object-Oriented Setting
With the increasing emphasis on dependability in complex, distributed systems, it is essential that system development can be done gradually and at different levels of detail. In ...
Einar Broch Johnsen, Olaf Owe, Ellen Munthe-Kaas, ...