Sciweavers

SSS
2005
Springer
14 years 1 month ago
Self-stabilization of Byzantine Protocols
Awareness of the need for robustness in distributed systems increases as distributed systems become integral parts of day-to-day systems. Self-stabilizing while tolerating ongoing ...
Ariel Daliot, Danny Dolev
SOSP
2007
ACM
14 years 4 months ago
Sinfonia: a new paradigm for building scalable distributed systems
We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols—a major complication in existing distri...
Marcos Kawazoe Aguilera, Arif Merchant, Mehul A. S...
INFOCOM
2006
IEEE
14 years 1 month ago
Data Synchronization Methods Based on ShuffleNet and Hypercube for Networked Information Systems
In contrast to a typical single source of data updates in Internet applications, data files in a networked information system are often distributed, replicated, accessed and up...
David J. Houck, Kin K. Leung, Peter Winkler
SOSP
2001
ACM
14 years 4 months ago
BASE: Using Abstraction to Improve Fault Tolerance
Miguel Castro (Microsoft Research), Rodrigo Rodrigues and Barbara Liskov (MIT Laboratory for Computer Science). Software errors are a major...
Rodrigo Rodrigues, Miguel Castro, Barbara Liskov
ICDCS
2012
IEEE
11 years 10 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...