Sciweavers

» Implementing and Evaluating Automatic Checkpointing
MICRO
2006
IEEE
13 years 6 months ago
SWICH: A Prototype for Efficient Cache-Level Checkpointing and Rollback
Low-overhead checkpointing and rollback is a popular technique for fault recovery. While different approaches are possible, hardware-supported checkpointing and rollback at the ca...
Radu Teodorescu, Jun Nakano, Josep Torrellas
CLUSTER
2005
IEEE
14 years 6 days ago
Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments
Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as ...
Daniel Nurmi, John Brevik, Richard Wolski
VEE
2006
ACM
14 years 16 days ago
A new approach to real-time checkpointing
The progress towards programming methodologies that simplify the work of the programmer involves automating, whenever possible, activities that are secondary to the main task of d...
Antonio Cunei, Jan Vitek
NOMS
2010
IEEE
13 years 4 months ago
Checkpoint-based fault-tolerant infrastructure for virtualized service providers
Crash and omission failures are common in service providers: a disk can break down or a link can fail anytime. In addition, the probability of a node failure increases with the num...
Iñigo Goiri, Ferran Julià, Jordi Gui...
CLUSTER
2004
IEEE
13 years 10 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...