Sciweavers

698 search results - page 85 / 140
» Synthesis of Fault-Tolerant Distributed Systems
PDCAT
2005
Springer
14 years 1 month ago
A New Algorithm to Solve Synchronous Consensus for Dependent Failures
Fault tolerant algorithms are often designed under the t-out-of-n assumption, which is based on the assumption that all processes or components fail independently with equal proba...
Jun Wang, Min Song
OPODIS
2008
13 years 9 months ago
Byzantine Consensus with Unknown Participants
Consensus is a fundamental building block used to solve many practical problems that appear on reliable distributed systems. In spite of the fact that consensus is being ...
Eduardo Adílio Pelinson Alchieri, Alysson N...
IEEEHPCS
2010
13 years 5 months ago
Using replication and checkpointing for reliable task management in computational Grids
In grid computing systems, providing fault-tolerance is required for both scientific computation and file-sharing to increase their reliability. In previous works, several mechani...
Sangho Yi, Derrick Kondo, Bongjae Kim, Geunyoung P...
CASCON
1996
13 years 9 months ago
Evaluating the costs of management: a distributed applications management testbed
In today's distributed computing environments, users are making increasing demands on the systems, networks, and applications they use. Users are coming to expect performance,...
Michael Katchabaw, Stephen L. Howard, Andrew D. Ma...
HPDC
2009
IEEE
14 years 2 months ago
Interconnect agnostic checkpoint/restart in open MPI
Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine