Sciweavers

» Invariants Based Failure Diagnosis in Distributed Computing ...
DSN
2003
IEEE
14 years 26 days ago
Comparison of Failure Detectors and Group Membership: Performance Study of Two Atomic Broadcast Algorithms
Protocols that solve agreement problems are essential building blocks for fault tolerant distributed systems. While many protocols have been published, little has been done to ana...
Péter Urbán, Ilya Shnayderman, Andr...
PODC
1994
ACM
13 years 11 months ago
A Checkpoint Protocol for an Entry Consistent Shared Memory System
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
Nuno Neves, Miguel Castro, Paulo Guedes
IEEEAMS
2003
IEEE
14 years 25 days ago
On Conditions for Self-Healing in Distributed Software Systems
This paper attempts to identify one of the necessary conditions for self-healing, or self-repair, in complex systems, and to propose means for satisfying this condition in heterog...
Naftaly H. Minsky
HYBRID
2003
Springer
14 years 23 days ago
Estimation of Distributed Hybrid Systems Using Particle Filtering Methods
Networked embedded systems are composed of a large number of components that interact with the physical world via a set of sensors and actuators, have their own computati...
Xenofon D. Koutsoukos, James Kurien, Feng Zhao
MOBISYS
2007
ACM
14 years 7 months ago
NodeMD: diagnosing node-level faults in remote wireless sensor systems
Software failures in wireless sensor systems are notoriously difficult to debug. Resource constraints in wireless deployments substantially restrict visibility into the root cause...
Veljko Krunic, Eric Trumpler, Richard Han