Sciweavers

234 search results - page 19 / 47
» Optimal recovery schemes in fault tolerant distributed compu...
ICPP
1998
IEEE
14 years 21 days ago
Fault-Tolerant Multicasting in Multistage Interconnection Networks
In this paper, we study fault-tolerant multicasting in multistage interconnection networks (MINs) for constructing large-scale multicomputers. In addition to point-to-point routing ...
Jinsoo Kim, Jaehyung Park, Jung Wan Cho, Hyunsoo Y...
IPPS
2010
IEEE
13 years 6 months ago
Improving the performance of hypervisor-based fault tolerance
Hypervisor-based fault tolerance (HBFT), a checkpoint-recovery mechanism, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, H...
Jun Zhu, Wei Dong, Zhefu Jiang, Xiaogang Shi, Zhen...
IPPS
2005
IEEE
14 years 2 months ago
Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI
Fault tolerance in MPI becomes a main issue in the HPC community. Several approaches are envisioned from user or programmer controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
IPPS
2010
IEEE
13 years 6 months ago
Optimizing RAID for long term data archives
We present new methods to extend data reliability of disks in RAID systems for applications like long term data archival. The proposed solutions extend existing algorithms to detec...
Henning Klein, Jörg Keller
SIGMOD
2004
ACM
14 years 8 months ago
Highly-Available, Fault-Tolerant, Parallel Dataflows
We present a technique that masks failures in a cluster to provide high availability and fault-tolerance for long-running, parallelized dataflows. We can use these dataflows to im...
Mehul A. Shah, Joseph M. Hellerstein, Eric A. Brew...