Faults in Grids: Why are they so bad and What can be done about it?

Computational Grids have the potential to become the main execution platform for high-performance and distributed applications. However, such systems are extremely complex and prone to failures. In this paper, we present a survey of the grid community in which several people shared their actual experience regarding fault treatment. The survey reveals that, nowadays, users have to be highly involved in diagnosing failures, that most failures are due to configuration problems (a hint of the area’s immaturity), and that solutions for dealing with failures are mainly application-dependent. Going further, we identify two main reasons for this state of affairs. First, grid components that provide high-level abstractions when working expose all the gory details when broken. Since there are no appropriate mechanisms to deal with the complexity exposed (configuration, middleware, hardware and software issues), users need to be deeply involved in the diagnosis and correction of failures, wh...

Complex Failures | Grid | GRID 2003 | Main Execution Platform

Post Info

Added	06 Jul 2010
Updated	06 Jul 2010
Type	Conference
Year	2003
Where	GRID
Authors	Raissa Medeiros, Walfredo Cirne, Francisco Vilar Brasileiro, Jacques Philippe Sauvé
