Sciweavers

695 search results - page 25 / 139
» Cache based fault recovery for distributed systems
IPPS
2007
IEEE
14 years 2 months ago
A Framework for Experimental Validation and Performance Evaluation in Fault Tolerant Distributed System
Performing experimental evaluation of fault tolerant distributed systems is a complex and tedious task, and automating as much as possible of the execution and evaluation of exper...
Hein Meling
MIDDLEWARE
2010
Springer
13 years 6 months ago
dFault: Fault Localization in Large-Scale Peer-to-Peer Systems
Distributed hash tables (DHTs) have been adopted as a building block for large-scale distributed systems. The upshot of this success is that their robust operation is even more imp...
Pawan Prakash, Ramana Rao Kompella, Venugopalan Ra...
MICRO
2006
IEEE
13 years 8 months ago
SWICH: A Prototype for Efficient Cache-Level Checkpointing and Rollback
Low-overhead checkpointing and rollback is a popular technique for fault recovery. While different approaches are possible, hardware-supported checkpointing and rollback at the ca...
Radu Teodorescu, Jun Nakano, Josep Torrellas
DAIS
2006
13 years 9 months ago
Using Speculative Push for Unnecessary Checkpoint Creation Avoidance
This paper discusses a way of incorporating speculation techniques into Distributed Shared Memory (DSM) systems with checkpointing mechanism without creating unnecessary ...
Arkadiusz Danilecki, Michal Szychowiak
FPGA
2000
ACM
14 years 1 day ago
Tolerating operational faults in cluster-based FPGAs
In recent years the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain ...
Vijay Lakamraju, Russell Tessier