Sciweavers

» Transparent Fault Tolerance for Grid Applications
ICS
2007
Tsinghua U.
14 years 2 months ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
EUROSYS
2011
ACM
13 years 2 days ago
Refuse to crash with Re-FUSE
We introduce Re-FUSE, a framework that provides support for restartable user-level file systems. Re-FUSE monitors the user-level file-system and on a crash transparently restart...
Swaminathan Sundararaman, Laxman Visampalli, Andre...
CCGRID
2008
IEEE
14 years 3 months ago
An Autonomic Workflow Management System for Global Grids
A Workflow Management System is generally utilized to define, manage and execute workflow applications on Grid resources. However, the increasing scale complexity, heterogeneity and...
Mustafizur Rahman 0003, Rajkumar Buyya
LCPC
2009
Springer
14 years 1 months ago
A Communication Framework for Fault-Tolerant Parallel Execution
PC grids represent massive computation capacity at a low cost, but are challenging to employ for parallel computing because of variable and unpredictable performance and availabili...
Nagarajan Kanna, Jaspal Subhlok, Edgar Gabriel, Es...
ESCIENCE
2006
IEEE
14 years 2 months ago
Practical Fault-Tolerant Framework for eScience Infrastructure
Many areas of science currently use computing resources as an important part of their research, and many research groups adopt cluster architecture to use them efficiently and mana...
Hyuck Han, Jai Wug Kim, Jongpil Lee, Youngjin Yu, ...