Sciweavers

» Cache based fault recovery for distributed systems
DSN
2000
IEEE
13 years 11 months ago
Loki: A State-Driven Fault Injector for Distributed Systems
Distributed applications can fail in subtle ways that depend on the state of multiple parts of a system. This complicates the validation of such systems via fault injection, since...
Ramesh Chandra, Ryan M. Lefever, Michel Cukier, Wi...
SRDS
1999
IEEE
13 years 11 months ago
Fault Injection based on a Partial View of the Global State of a Distributed System
Validating distributed systems is particularly difficult, since failures may occur due to a correlated occurrence of faults in different parts of the system. This paper describes ...
Michel Cukier, Ramesh Chandra, David Henke, Jessic...
ATAL
2009
Springer
14 years 1 month ago
Combining fault injection and model checking to verify fault tolerance in multi-agent systems
The ability to guarantee that a system will continue to operate correctly under degraded conditions is key to the success of adopting multi-agent systems (MAS) as a paradigm for d...
Jonathan Ezekiel, Alessio Lomuscio
DSN
2003
IEEE
14 years 3 hours ago
Integrating Recovery Strategies into a Primary Substation Automation System
The DepAuDE architecture provides middleware to integrate fault tolerance support into distributed embedded automation applications. It allows error recovery to be expressed in te...
Geert Deconinck, Vincenzo De Florio, Ronnie Belman...
CCGRID
2010
IEEE
13 years 7 months ago
Selective Recovery from Failures in a Task Parallel Programming Model
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...