Search Sciweavers | Sciweavers

1256 search results - page 11 / 252

» On Coordinated Checkpointing in Distributed Systems

HIPC
2007
Springer

133 views | Distributed And Parallel Com...

A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications

14 years 2 months ago

Download www.cse.buffalo.edu

As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...

John Paul Walters, Vipin Chaudhary

IPPS
2009
IEEE

116 views | Distributed And Parallel Com...

DMTCP: Transparent checkpointing for cluster computations and the desktop

14 years 3 months ago

Download dmtcp.sourceforge.net

DMTCP (Distributed MultiThreaded CheckPointing) is a transparent user-level checkpointing package for distributed applications. Checkpointing and restart is demonstrated for a wid...

Jason Ansel, Kapil Arya, Gene Cooperman

AP2PS
2009
IEEE

240 views | Computer Networks

Algorithm-Based Fault Tolerance Applied to P2P Computing Networks

14 years 3 days ago

Download moais.imag.fr

P2P computing platforms are subject to a wide range of attacks. In this paper, we propose a generalisation of the previous disk-less checkpointing approach for fault-tolerance i...

Thomas Roche, Mathieu Cunche, Jean-Louis Roch

IPPS
1996
IEEE

175 views | Distributed And Parallel Com...

CoCheck: Checkpointing and Process Migration for MPI

14 years 28 days ago

Download www.ece.rutgers.edu

Checkpointing of parallel applications can be used as the core technology to provide process migration. Both checkpointing and migration are important issues for parallel appl...

Georg Stellner

DSN
2005
IEEE

110 views | Computer Networks

Cruz: Application-Transparent Distributed Checkpoint-Restart on Standard Operating Systems

14 years 2 months ago
