GRID
2003
Springer

Faults in Grids: Why are they so bad and What can be done about it?

Computational Grids have the potential to become the main execution platform for high-performance and distributed applications. However, such systems are extremely complex and prone to failures. In this paper, we present a survey of the grid community in which several people shared their actual experience regarding fault treatment. The survey reveals that, nowadays, users have to be highly involved in diagnosing failures, that most failures are due to configuration problems (a hint of the area’s immaturity), and that solutions for dealing with failures are mainly application-dependent. Going further, we identify two main reasons for this state of affairs. First, grid components that provide high-level abstractions when working expose all the gory details when broken. Since there are no appropriate mechanisms to deal with the complexity exposed (configuration, middleware, hardware and software issues), users need to be deeply involved in the diagnosis and correction of failures, wh...
Added: 06 Jul 2010
Updated: 06 Jul 2010
Type: Conference
Year: 2003
Where: GRID
Authors: Raissa Medeiros, Walfredo Cirne, Francisco Vilar Brasileiro, Jacques Philippe Sauvé