Sciweavers

» Systems Failures
IPPS
2009
IEEE
14 years 2 months ago
Robust sequential resource allocation in heterogeneous distributed systems with random compute node failures
The problem of finding efficient workload distribution techniques is becoming increasingly important today for heterogeneous distributed systems where the availability of comp...
Vladimir Shestak, Edwin K. P. Chong, Anthony A. Ma...
ACSC
2006
IEEE
14 years 1 month ago
Segregated failures model for availability evaluation of fault-tolerant systems
This paper presents a method of estimating the availability of fault-tolerant computer systems with several recovery procedures. A segregated failures model has been proposed rece...
Sergiy A. Vilkomir, David Lorge Parnas, Veena B. M...
IEEEARES
2008
IEEE
14 years 2 months ago
A Lazy Monitoring Approach for Heartbeat-Style Failure Detectors
Failure detectors are a fundamental part of safe fault-tolerant distributed systems. Many failure detectors use heartbeats to draw conclusions about the state of nodes within a ...
Benjamin Satzger, Andreas Pietzowski, Wolfgang Tru...
CLUSTER
1999
IEEE
13 years 7 months ago
Simulative performance analysis of gossip failure detection for scalable distributed systems
Three protocols for gossip-based failure detection services in large-scale heterogeneous clusters are analyzed and compared. The basic gossip protocol provides a means by which fai...
Mark W. Burns, Alan D. George, Bradley A. Wallace
JPDC
2010
13 years 6 months ago
Stabilizing leader election in partial synchronous systems with crash failures
This article deals with stabilization and fault-tolerance. We consider two types of stabilization: self- and pseudo-stabilization. Our goal is to implement the self- and/...
Carole Delporte-Gallet, Stéphane Devismes, ...