Sciweavers

35 search results - page 5 / 7
» Distributed Recovery with K-Optimistic Logging
IPPS
2005
IEEE
14 years 29 days ago
Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI
Fault tolerance in MPI becomes a main issue in the HPC community. Several approaches are envisioned from user or programmer controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
CCS
2009
ACM
14 years 8 months ago
Logging key assurance indicators in business processes
Management of a modern enterprise is based on the assumption that executive reports of lower-layer management are faithful to what is actually happening in the field. As some well...
Fabio Massacci, Gene Tsudik, Artsiom Yautsiukhin
COMPSAC
2003
IEEE
14 years 20 days ago
Protecting Distributed Software Upgrades that Involve Message-Passing Interface Changes
We present in this paper an extension of the message-driven confidence-driven framework that we developed for onboard guarded software upgrading. The purpose of this work is to pr...
Ann T. Tai, Kam S. Tso, William H. Sanders
IPPS
2010
IEEE
13 years 5 months ago
Scalable failure recovery for high-performance data aggregation
Many high-performance tools, applications and infrastructures, such as Paradyn, STAT, TAU, Ganglia, SuperMon, Astrolabe, Borealis, and MRNet, use data aggregation to synthesize lar...
Dorian C. Arnold, Barton P. Miller
CLUSTER
2004
IEEE
13 years 7 months ago
MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware
Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-pas...
Rajanikanth Batchu, Yoginder S. Dandass, Anthony S...