Search Sciweavers | Sciweavers

535 search results - page 14 / 107

» Fault tolerant high performance computing by a coding approa...

ACTA
2005

104 views

Optimal recovery schemes in fault tolerant distributed computing

15 years 7 months ago

Download www.ipd.bth.se

Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all n computers are up and running, we would like the load to be evenly distr...

Kamilla Klonowska, Håkan Lennerstad, Lars Lu...

FTCS
1996

132 views

An Approach towards Benchmarking of Fault-Tolerant Commercial Systems

15 years 8 months ago

Download www.tsai-family.com

This paper presents a benchmark for dependable systems. The benchmark consists of two metrics, number of catastrophic incidents and performance degradation, which are obtained by a...

Timothy K. Tsai, Ravishankar K. Iyer, Doug Jewitt

CCGRID
2006
IEEE

131 views, Distributed And Parallel Com...

Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation

16 years 1 month ago

Download icl.cs.utk.edu

With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...

Yuan Tang, Graham E. Fagg, Jack Dongarra

IPPS
2006
IEEE

89 views, Distributed And Parallel Com...

An advanced performance analysis of self-stabilizing protocols: stabilization time with transient faults during convergence

16 years 1 month ago

Download www.cecs.uci.edu

A self-stabilizing protocol is a brilliant framework for fault tolerance. It can recover from any number and any type of transient faults and eventually converge to its intended b...

Yoshihiro Nakaminami, Hirotsugu Kakugawa, Toshimit...

IPPS
2003
IEEE

124 views, Distributed And Parallel Com...

Using Golomb Rulers for Optimal Recovery Schemes in Fault Tolerant Distributed Computing

16 years 23 days ago

Download www.ipd.bth.se

Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all computers are up and running, we would like the load to be evenly distrib...

Kamilla Klonowska, Lars Lundberg, Håkan Lenn...
