Sciweavers

511 search results - page 76 / 103
» A Model for Space-Correlated Failures in Large-Scale Distrib...
MODELS
2009
Springer
14 years 3 months ago
An Incremental Algorithm for High-Performance Runtime Model Consistency
We present a novel technique for applying two-level runtime models to distributed systems. Our approach uses graph rewriting rules to transform a high-level source model into one o...
Christopher Wolfe, T. C. Nicholas Graham, W. Greg ...
CLUSTER
2011
IEEE
12 years 8 months ago
Dynamic Load Balance for Optimized Message Logging in Fault Tolerant HPC Applications
Computing systems will grow significantly larger in the near future to satisfy the needs of computational scientists in areas like climate modeling, biophysics and cosmology. S...
Esteban Meneses, Laxmikant V. Kalé, Greg Br...
PDP
1996
IEEE
14 years 20 days ago
Application-Dependent Performability Evaluation of Fault-Tolerant Multiprocessors
A case study of performance and dependability evaluation of fault-tolerant multiprocessors is presented. Two specific architectures are analyzed taking into account system functio...
Stefan Dalibor, A. Hein, Wolfgang Hohl
CLOUDCOM
2009
Springer
13 years 12 months ago
Decentralized Service Allocation in a Broker Overlay Based Grid
Grid computing is based on coordinated resource sharing in a dynamic environment of multi-institutional virtual organizations. Data exchanges, and service allocation, are...
Abdulrahman Azab, Hein Meling
DSN
2004
IEEE
14 years 7 days ago
Fault Tolerance Tradeoffs in Moving from Decentralized to Centralized Embedded Systems
Some safety-critical distributed embedded systems may need to use centralized components to achieve certain dependability properties. The difficulty in combining centralized and d...
Jennifer Morris, Daniel Kroening, Philip Koopman