Sciweavers

695 search results - page 37 / 139
» Cache based fault recovery for distributed systems
Sort
View
USENIX
1996
13 years 9 months ago
Transparent Fault Tolerance for Parallel Applications on Networks of Workstations
This paper describes a new method for providingtransparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of sh...
Daniel J. Scales, Monica S. Lam
HPCC
2009
Springer
13 years 6 months ago
Reliability Optimization of Reconfigurable Computing-Based Fault-Tolerant System
Domain-partition (DP) model is a general model for reliability maximization problem under given redundancy. In this paper, an improved DP model is used to formulate a reconfigurati...
Mi Zhou, Lihong Shang, Yu Hu
HPCA
2006
IEEE
14 years 8 months ago
BulletProof: a defect-tolerant CMP switch architecture
As silicon technologies move into the nanometer regime, transistor reliability is expected to wane as devices become subject to extreme process variation, particle-induced transie...
Kypros Constantinides, Stephen Plaza, Jason A. Blo...
PPOPP
2006
ACM
14 years 2 months ago
McRT-STM: a high performance software transactional memory system for a multi-core runtime
Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed t...
Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hu...
IPPS
2009
IEEE
14 years 3 months ago
A fusion-based approach for tolerating faults in finite state machines
Given a set of n different deterministic finite state machines (DFSMs) modeling a distributed system, we examine the problem of tolerating f crash or Byzantine faults in such a ...
Vinit A. Ogale, Bharath Balasubramanian, Vijay K. ...