Search Sciweavers | Sciweavers

146 search results - page 2 / 30

» Transparent Checkpoint-Restart of Distributed Applications o...

HIPC
2009
Springer

146 views | Distributed And Parallel Com...

Fast checkpointing by Write Aggregation with Dynamic Buffer and Interleaving on multicore architecture

15 years 4 months ago

Download nowlab.cse.ohio-state.edu

Large scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...

Xiangyong Ouyang, Karthik Gopalakrishnan, Tejus Ga...

ICPP
2009
IEEE

185 views | Distributed And Parallel Com...

Accelerating Checkpoint Operation by Node-Level Write Aggregation on Multicore Systems

16 years 1 months ago

Download nowlab.cse.ohio-state.edu

Clusters and applications continue to grow in size while their mean time between failure (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...

Xiangyong Ouyang, Karthik Gopalakrishnan, Dhabales...

PVM
2005
Springer

78 views | Distributed And Parallel Com...

Scalable Fault Tolerant MPI: Extending the Recovery Algorithm

16 years 12 days ago

Download icl.cs.utk.edu

Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...

Graham E. Fagg, Thara Angskun, George Bosilca, Jel...

IPPS
2005
IEEE

117 views | Distributed And Parallel Com...

User Transparent Parallel Processing of the 2004 NIST TRECVID Data Set

16 years 16 days ago

Download staff.science.uva.nl

The Parallel-Horus framework, developed at the University of Amsterdam, is a unique software architecture that allows non-expert parallel programmers to develop fully sequential m...

Frank J. Seinstra, Cees Snoek, Dennis Koelma, Jan-...

PPOPP
2006
ACM

143 views | Distributed And Parallel Com...

Fast and transparent recovery for continuous availability of cluster-based servers

16 years 27 days ago

Download www.ics.forth.gr

Recently there has been renewed interest in building reliable servers that support continuous application operation. Besides maintaining system state consistent after a failure, o...

Rosalia Christodoulopoulou, Kaloian Manassiev, Ang...
