Sciweavers

WOSS
2004
ACM
14 years 1 month ago
Design of self-managing dependable systems with UML and fault tolerance patterns
The development of dependable software systems is a costly undertaking. Fault tolerance techniques as well as self-repair capabilities usually result in additional system complexi...
Matthias Tichy, Daniela Schilling, Holger Giese
IPTPS
2005
Springer
14 years 1 month ago
Practical Locality-Awareness for Large Scale Information Sharing
Tulip is an overlay for routing, searching and publish-lookup information sharing. It offers a unique combination of the advantages of both structured and unstructured overlays, t...
Ittai Abraham, Ankur Badola, Danny Bickson, Dahlia...
GI
2005
Springer
14 years 1 month ago
On the Coverage of Proactive Security: An Addition to the Taxonomy of Faults
Intrusion tolerance is a recent approach to deal with intentional and malicious failures. It combines the research on fault tolerance with the research on security, and relies on...
Timo Warns
ADAEUROPE
2005
Springer
14 years 1 month ago
Non-intrusive System Level Fault-Tolerance
This paper describes the methodology used to add nonintrusive system-level fault tolerance to an electronic throttle controller. The original model of the throttle controller is a...
Kristina Lundqvist, Jayakanth Srinivasan, Sé...
SC
2005
ACM
14 years 1 month ago
Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers
We describe the software architecture, technical features, and performance of TICK (Transparent Incremental Checkpointer at Kernel level), a system-level checkpointer implemented ...
Roberto Gioiosa, José Carlos Sancho, Song J...
SAC
2005
ACM
14 years 1 month ago
An agent model for fault-tolerant systems
This paper describes the use of fault tolerance in a multiagent system. Such an approach is based on the modeling of autonomous agents with planning capabilities. These capabiliti...
Avelino F. Zorzo, Felipe Rech Meneguzzi
ISPAN
2005
IEEE
14 years 1 month ago
Coordinated Robust Routing by Dual Cluster Heads in Layered Wireless Sensor Networks
In this paper, we propose the coordinated robust routing (CRR) scheme to address the fault tolerance requirements in the layered wireless sensor networks. In the proposed scheme, ...
Mei Yang, Jianping Wang, Zhen-guo Gao, Yingtao Jia...
IPPS
2005
IEEE
14 years 1 month ago
Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance
Checkpoint/restart is a general idea for which particular implementations enable various functionalities in computer systems, including process migration, gang scheduling, hiberna...
José Carlos Sancho, Fabrizio Petrini, Kei D...
IPPS
2005
IEEE
14 years 1 month ago
Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries
We observe increasing interest in aggregating geographically distributed, heterogeneous resources to perform large scale computations. MPI remains the most popular programming par...
Dawid Kurzyniec, Vaidy S. Sunderam
IPPS
2005
IEEE
14 years 1 month ago
Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI
Fault tolerance in MPI becomes a main issue in the HPC community. Several approaches are envisioned from user or programmer controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...