Sciweavers

695 search results - page 33 / 139
» Cache based fault recovery for distributed systems
Sort
View
NOCS
2007
IEEE
14 years 2 months ago
The Power of Priority: NoC Based Distributed Cache Coherency
The paper introduces Network-on-Chip (NoC) design methodology and low cost mechanisms for supporting efficient cache access and cache coherency in future high-performance Chip Mul...
Evgeny Bolotin, Zvika Guz, Israel Cidon, Ran Ginos...
DSN
2006
IEEE
14 years 2 months ago
Dynamic Verification of Memory Consistency in Cache-Coherent Multithreaded Computer Architectures
—Multithreaded servers with cache-coherent shared memory are the dominant type of machines used to run critical network services and database management systems. To achieve the h...
Albert Meixner, Daniel J. Sorin
SRDS
2000
IEEE
14 years 25 days ago
Dynamic Node Management and Measure Estimation in a State-Driven Fault Injector
Validation of distributed systems using fault injection is difficult because of their inherent complexity, lack of a global clock, and lack of an easily accessible notion of a gl...
Ramesh Chandra, Michel Cukier, Ryan M. Lefever, Wi...
ICDCS
1997
IEEE
14 years 20 days ago
Distributed Recovery with K-Optimistic Logging
Fault-tolerance techniques based on checkpointing and message logging have been increasingly used in real-world applications to reduce service down-time. Most industrial applicati...
Yi-Min Wang, Om P. Damani, Vijay K. Garg
CCGRID
2006
IEEE
14 years 2 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC(High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra