Sciweavers

115 search results - page 6 / 23
» Transparent Fault Tolerance for Parallel Applications on Net...
EUROMICRO
1996
IEEE
13 years 11 months ago
Experience of Adaptive Replication in Distributed File Systems
Replication is a key strategy for improving locality, fault tolerance and availability in distributed systems. The paper focuses on distributed file systems and presents a system ...
Giacomo Cabri, Antonio Corradi, Franco Zambonelli
SSS
2007
Springer
14 years 1 months ago
Distributed Synthesis of Fault-Tolerant Programs in the High Atomicity Model
In this paper, we concentrate on distributed algorithms for automated synthesis of fault-tolerant programs in the high atomicity model, where all processes can read and write all p...
Borzoo Bonakdarpour, Sandeep S. Kulkarni, Fuad Abu...
HPDC
1994
IEEE
13 years 11 months ago
Network Partitioning of Data Parallel Computations
Partitioning data parallel computations across a network of heterogeneous workstations is a difficult problem for the user. We have developed a runtime partitioning method for choos...
Jon B. Weissman, Andrew S. Grimshaw
IPPS
2000
IEEE
14 years 1 days ago
FANTOMAS: Fault Tolerance for Mobile Agents in Clusters
To achieve an efficient utilization of cluster systems, a proper programming and operating environment is required. In this context, mobile agents are of growing interes...
Holger Pals, Stefan Petri, Claus Grewe
SIGMETRICS
2010
ACM
14 years 14 days ago
Transparent, lightweight application execution replay on commodity multiprocessor operating systems
We present S, the first system to provide transparent, lowoverhead application record-replay and the ability to go live from replayed execution. S i...
Oren Laadan, Nicolas Viennot, Jason Nieh