Search Sciweavers | Sciweavers

5 search results - page 1 / 1

» Hybrid Checkpointing for MPI Jobs in HPC Environments


ICPADS
2010
IEEE

169 views | Distributed And Parallel Com...

Hybrid Checkpointing for MPI Jobs in HPC Environments

13 years 7 months ago

Download moss.csc.ncsu.edu

As the core count in high-performance computing systems keeps increasing, faults are becoming commonplace. Checkpointing addresses such faults but captures full process images ev...

Chao Wang, Frank Mueller, Christian Engelmann, Ste...


HPDC
2009
IEEE

101 views | Distributed And Parallel Com...

Interconnect agnostic checkpoint/restart in open MPI

14 years 4 months ago

Download www.osl.iu.edu

Long-running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...

Joshua Hursey, Timothy Mattox, Andrew Lumsdaine


ICDCS
2012
IEEE

238 views | Distributed And Parallel Com...

Combining Partial Redundancy and Checkpointing for HPC

12 years 5 days ago

Download moss.csc.ncsu.edu

Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...

James Elliott, Kishor Kharbas, David Fiala, Frank ...


CCGRID
2006
IEEE

131 views | Distributed And Parallel Com...

Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation

14 years 3 months ago

Download icl.cs.utk.edu

With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...

Yuan Tang, Graham E. Fagg, Jack Dongarra


GRID
2004
Springer

102 views | Distributed And Parallel Com...

Hybrid Preemptive Scheduling of MPI Applications on the Grids

14 years 3 months ago

Download www.cs.utk.edu

Time sharing between all the users of a Grid is a major issue in cluster and Grid integration. Classical Grid architecture involves a higher level scheduler which submits non o...

Aurelien Bouteiller, Hinde-Lilia Bouziane, Thomas ...
