Search Sciweavers | Sciweavers

698 search results - page 28 / 140

» Synthesis of Fault-Tolerant Distributed Systems

IPPS
1998
IEEE

133 views | Distributed And Parallel Com...

A Flexible Approach for a Fault-Tolerant Router

15 years 11 months ago

Download ipdps.cc.gatech.edu

Cluster systems gain more and more importance as a platform for parallel computing. In this area the power of the system is strongly coupled with the performance of the network, ...

Andreas C. Döring, Wolfgang Obelöer, Gun...

IPPS
2008
IEEE

136 views | Distributed And Parallel Com...

Enhancing application robustness through adaptive fault tolerance

16 years 1 month ago

Download www.cs.iit.edu

As the scale of high performance computing (HPC) continues to grow, application fault resilience becomes crucial. To address this problem, we are working on the design of an adapt...

Zhiling Lan, Yawei Li, Ziming Zheng, Prashasta Guj...

CCGRID
2003
IEEE

133 views | Distributed And Parallel Com...

Improved Read Performance in a Cost-Effective, Fault-Tolerant Parallel Virtual File System (CEFT-PVFS)

16 years 23 days ago

Download cse.unl.edu

Due to the ever-widening performance gap between processors and disks, I/O operations tend to become the major performance bottleneck of data-intensive applications on modern clus...

Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, David ...

IPPS
2005
IEEE

159 views | Distributed And Parallel Com...

Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance

16 years 1 month ago

Download hpc.pnl.gov

Checkpoint/restart is a general idea for which particular implementations enable various functionalities in computer systems, including process migration, gang scheduling, hiberna...

José Carlos Sancho, Fabrizio Petrini, Kei D...

CCGRID
2008
IEEE

129 views | Distributed And Parallel Com...

Fault Tolerance in Cluster Federations with O2P-CF

15 years 9 months ago

Download xcr.cenit.latech.edu

Fault tolerance is one of the key issues for large scale applications executed on high performance computing systems. In a cluster federation, clusters are gathered to provide hug...

Thomas Ropars, Christine Morin
