Sciweavers

TPDS
1998
13 years 7 months ago
On Coordinated Checkpointing in Distributed Systems
Coordinated checkpointing simplifies failure recovery and eliminates domino effects in case of failures by preserving a consistent global checkpoint on stable storage. However, ...
Guohong Cao, Mukesh Singhal
HPDC
2000
IEEE
13 years 12 months ago
Failure-Atomic File Access in an Interposed Network Storage System
This paper presents a recovery protocol for block I/O operations in Slice, a storage system architecture for high-speed LANs incorporating network-attached block storage. The goal ...
Darrell C. Anderson, Jeffrey S. Chase
WOSS
2004
ACM
14 years 27 days ago
Combining statistical monitoring and predictable recovery for self-management
Complex distributed Internet services form the basis not only of e-commerce but increasingly of mission-critical network-based applications. What is new is that the workload and in...
Armando Fox, Emre Kiciman, David A. Patterson
FTCS
1993
13 years 8 months ago
Virtually-Synchronous Communication Based on a Weak Failure Suspector
Failure detectors (or, more accurately, Failure Suspectors, FS) appear to be a fundamental service upon which to build fault-tolerant, distributed applications. This paper shows t...
André Schiper, Aleta Ricciardi
IJCAI
1989
13 years 8 months ago
Using and Refining Simplifications: Explanation-Based Learning of Plans in Intractable Domains
This paper describes an explanation-based approach to learning plans despite a computationally intractable domain theory. In this approach, the system learns an initial plan using...
Steve A. Chien