Implementation of Fault-Tolerant GridRPC Applications


In this paper, a task-parallel application is implemented with Ninf-G, a GridRPC system, and run on a Grid testbed in the Asia-Pacific region for three months. The application is designed to run for a long time, and typical fault patterns were gathered from tens of long executions. Unstable network throughput turned out to be one of the largest causes of faults. The paper then stresses an important point for application developers: avoid a serious drop in task throughput while faults are being handled, by minimizing the timeout used for fault detection, performing recovery in the background, and assigning duplicate tasks. The study also offers guidance for designing an automated fault-tolerance mechanism in a higher layer of the GridRPC framework.
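The three techniques named in the abstract (a short fault-detection timeout, background recovery, and duplicate task assignment) can be illustrated with a small local simulation. This is a minimal sketch under stated assumptions, not the paper's implementation and not the Ninf-G or GridRPC API: `run_task_farm`, `remote_call`, `recover`, and the server names are all hypothetical, and Python threads stand in for remote servers.

```python
import time
from collections import Counter
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task_farm(tasks, servers, remote_call, recover, timeout=0.2):
    """Run each task once despite slow or failing servers (illustrative only).

    remote_call(server, task) and recover(server) are hypothetical callables
    supplied by the caller; tasks must be unique and hashable.
    """
    results = {}          # task -> result of the first copy that finished
    todo = list(tasks)    # tasks with no copy currently in flight
    idle = list(servers)  # servers ready to accept a call
    calls = {}            # future -> (server, task, deadline)
    repairs = {}          # future -> server being recovered
    # Python threads cannot be killed, so an abandoned call keeps its thread
    # until it returns; the pool is oversized to leave room for that.
    pool = ThreadPoolExecutor(max_workers=4 * len(servers))
    try:
        while len(results) < len(tasks):
            while idle:
                if todo:
                    task = todo.pop(0)
                else:
                    # Duplicate assignment: nothing new is left, so give an idle
                    # server a second copy of a task that has only one in flight.
                    copies = Counter(
                        t for _, t, _ in calls.values() if t not in results
                    )
                    single = [t for t, n in copies.items() if n == 1]
                    if not single:
                        break
                    task = single[0]
                server = idle.pop(0)
                fut = pool.submit(remote_call, server, task)
                calls[fut] = (server, task, time.monotonic() + timeout)
            if not calls and not repairs:
                raise RuntimeError("no usable servers left")
            wait(list(calls) + list(repairs), timeout=0.01,
                 return_when=FIRST_COMPLETED)
            now = time.monotonic()
            for fut, (server, task, deadline) in list(calls.items()):
                done = fut.done()
                if done and fut.exception() is None:
                    results.setdefault(task, fut.result())  # first copy wins
                    idle.append(server)
                elif done or now > deadline:
                    # Fault (error or short timeout): recover in the background
                    # so the dispatcher keeps feeding the healthy servers.
                    repairs[pool.submit(recover, server)] = server
                else:
                    continue  # still running and within its deadline
                del calls[fut]
                in_flight = any(t == task for _, t, _ in calls.values())
                if task not in results and not in_flight and task not in todo:
                    todo.append(task)  # no copy left: reschedule the task
            for fut, server in list(repairs.items()):
                if fut.done():
                    del repairs[fut]
                    if fut.exception() is None:
                        idle.append(server)  # recovered servers rejoin the farm
    finally:
        pool.shutdown(wait=False)
    return results


if __name__ == "__main__":
    def square(server, task):
        if server == "slow":
            time.sleep(0.6)  # simulated throughput collapse on one server
        return task * task

    out = run_task_farm([1, 2, 3, 4], ["fast", "slow"], square, lambda s: None)
    print(sorted(out.items()))  # [(1, 1), (2, 4), (3, 9), (4, 16)]
```

In the demo, the task stuck on the slow server is completed by a duplicate copy on the fast one, so overall task throughput does not wait on the faulty server. A real GridRPC client would replace the thread pool with asynchronous remote calls and an actual handle re-initialization step.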

Yusuke Tanimura, Tsutomu Ikegami, Hidemoto Nakada, Yoshio Tanaka, Satoshi Sekiguchi


Distributed And Parallel Computing | GRID 2006 | Long Executions | Task Parallel Application | Typical Fault Patterns |


» Design Implementation and Performance Evaluation of GridRPC Programming Middleware for a L...

» Design and Implementation of a Pluggable Fault Tolerant CORBA Infrastructure

» Experimental Study of Multicriteria Scheduling Heuristics for GridRPC Systems

» High performance linpack benchmark a fault tolerant implementation without checkpointing

» Gateways for Accessing Fault Tolerance Domains

» A Framework for Proactive Fault Tolerance

» A Metaobject Architecture for FaultTolerant Distributed Systems The FRIENDS Approach

» HARNESS fault tolerant MPI design usage and performance issues

Post Info

Added	12 Dec 2010
Updated	12 Dec 2010
Type	Journal
Year	2006
Where	GRID
Authors	Yusuke Tanimura, Tsutomu Ikegami, Hidemoto Nakada, Yoshio Tanaka, Satoshi Sekiguchi
