Sciweavers

295 search results - page 28 / 59
» Invariants Based Failure Diagnosis in Distributed Computing ...
HCW
2000
IEEE
14 years 1 day ago
Reliable Cluster Computing with a New Checkpointing RAID-x Architecture
In a serverless cluster of PCs or workstations, the cluster must allow remote file accesses or parallel I/O directly performed over disks distributed to all client nodes. We intro...
Kai Hwang, Hai Jin, Roy S. C. Ho, Wonwoo Ro
ISORC
2009
IEEE
14 years 2 months ago
Fault-Tolerance for Component-Based Systems - An Automated Middleware Specialization Approach
General-purpose middleware, by definition, cannot readily support domain-specific semantics without significant manual efforts in specializing the middleware. This paper prese...
Sumant Tambe, Akshay Dabholkar, Aniruddha S. Gokha...
DCOSS
2009
Springer
14 years 2 months ago
Finding Symbolic Bug Patterns in Sensor Networks
This paper presents a failure diagnosis algorithm for summarizing and generalizing patterns that lead to instances of anomalous behavior in sensor networks. Often multipl...
Mohammad Maifi Hasan Khan, Tarek F. Abdelzaher, Ji...
ICDCS
1997
IEEE
13 years 12 months ago
Distributed Recovery with K-Optimistic Logging
Fault-tolerance techniques based on checkpointing and message logging have been increasingly used in real-world applications to reduce service down-time. Most industrial applicati...
Yi-Min Wang, Om P. Damani, Vijay K. Garg
PODC
1990
ACM
13 years 11 months ago
Sharing Memory Robustly in Message-Passing Systems
Emulators that translate algorithms from the shared-memory model to two different message-passing models are presented. Both are achieved by implementing a wait-free, atomic, singl...
Hagit Attiya, Amotz Bar-Noy, Danny Dolev