Sciweavers

» Crash Management for Distributed Parallel Systems
PODC
1992
ACM
13 years 11 months ago
The Weakest Failure Detector for Solving Consensus
We determine what information about failures is necessary and sufficient to solve Consensus in asynchronous distributed systems subject to crash failures. In Chandra and Toueg [199...
Tushar Deepak Chandra, Vassos Hadzilacos, Sam Toue...
ASAP
2008
IEEE
14 years 1 months ago
Managing multi-core soft-error reliability through utility-driven cross domain optimization
As semiconductor processing technology continues to scale down, managing reliability becomes an increasingly difficult challenge in high-performance microprocessor design. Transie...
Wangyuan Zhang, Tao Li
PODC
2012
ACM
11 years 9 months ago
Asynchronous failure detectors
Failure detectors, oracles that provide information about process crashes, are an important abstraction for crash tolerance in distributed systems. Although current failure-detector...
Alejandro Cornejo, Nancy A. Lynch, Srikanth Sastry
JSSPP
1995
Springer
13 years 11 months ago
Job Management Requirements for NAS Parallel Systems and Clusters
William Saphir, Leigh Ann Tanner, Bernard Traversa...