Search Sciweavers | Sciweavers

86 search results - page 6 / 18

» Hybrid checkpointing for parallel applications in cluster fe...

PDCAT
2009
Springer

CheCUDA: A Checkpoint/Restart Tool for CUDA Applications

14 years 2 months ago

Download www.sc.isc.tohoku.ac.jp

Abstract—In this paper, a tool named CheCUDA is designed to checkpoint CUDA applications that use GPUs as accelerators. As existing checkpoint/restart implementations do not supp...

Hiroyuki Takizawa, Katsuto Sato, Kazuhiko Komatsu,...

HPDC
2000
IEEE

RAID-x: A New Distributed Disk Array for I/O-Centric Cluster Computing

13 years 12 months ago

Download www.ecst.csuchico.edu

A new RAID-x (redundant array of inexpensive disks at level x) architecture is presented for distributed I/O processing on a serverless cluster of computers. The RAID-x architectu...

Kai Hwang, Hai Jin, Roy S. C. Ho

CLUSTER
2004
IEEE

Improved message logging versus improved coordinated checkpointing for fault tolerant MPI

13 years 11 months ago

Download www.cs.utk.edu

Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...

Pierre Lemarinier, Aurelien Bouteiller, Thomas H...

ICDCS
2012
IEEE

Combining Partial Redundancy and Checkpointing for HPC

11 years 10 months ago

Download moss.csc.ncsu.edu

Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...

James Elliott, Kishor Kharbas, David Fiala, Frank ...

IPPS
2006
IEEE

Coordinated checkpoint from message payload in pessimistic sender-based message logging

14 years 1 month ago

Download www.cecs.uci.edu

Execution of MPI applications on Clusters and Grid deployments suffers from node and network failures, which motivates the use of fault tolerant MPI implementations. Two category tec...

M. Aminian, Mohammad K. Akbari, Bahman Javadi
