Sciweavers

22 search results - page 1 / 5
IEEEHPCS
2010
13 years 4 months ago
Using replication and checkpointing for reliable task management in computational Grids
In grid computing systems, providing fault-tolerance is required for both scientific computation and file-sharing to increase their reliability. In previous works, several mechani...
Sangho Yi, Derrick Kondo, Bongjae Kim, Geunyoung P...
ICCS
2007
Springer
14 years 29 days ago
Providing Fault-Tolerance in Unreliable Grid Systems Through Adaptive Checkpointing and Replication
Abstract. As grids typically consist of autonomously managed subsystems with strongly varying resources, fault-tolerance forms an important aspect of the scheduling process of appl...
Maria Chtepen, Filip H. A. Claeys, Bart Dhoedt, Fi...
GRID
2004
Springer
14 years 6 days ago
Checkpoint and Restart for Distributed Components in XCAT3
With the advent of Grid computing, more and more high-end computational resources become available for use to a scientist. While this opens up new avenues for scientific research,...
Sriram Krishnan, Dennis Gannon
HPCC
2009
Springer
13 years 4 months ago
Graph-Based Task Replication for Workflow Applications
Abstract. The Grid is a heterogeneous and dynamic environment which enables distributed computation. This makes it a technology prone to failures. Some related work uses replicati...
Raúl Sirvent, Rosa M. Badia, Jesús L...
HPDC
2008
IEEE
14 years 1 months ago
Dynasa: adapting grid applications to safety using fault-tolerant methods
Grid applications have been prone to encountering problems such as failures or malicious attacks during execution, due to their distributed and large-scale features. The applicati...
Xuanhua Shi, Jean-Louis Pazat, Eric Rodriguez, Hai...