Sciweavers

442 search results - page 65 / 89
» Fault Tolerant Wide-Area Parallel Computing
EGC
2005
Springer
14 years 1 month ago
Workflow-Oriented Collaborative Grid Portals
The paper presents how workflow-oriented, single-user Grid portals could be extended to meet the requirements of users with collaborative needs. Through collaborative Grid portals ...
Gergely Sipos, Gareth J. Lewis, Péter Kacsu...
ISPA
2004
Springer
14 years 1 month ago
Highly Reliable Linux HPC Clusters: Self-Awareness Approach
Current solutions for fault-tolerance in HPC systems focus on dealing with the result of a failure. However, most are unable to handle runtime system configuration change...
Chokchai Leangsuksun, Tong Liu, Yudan Liu, Stephen...
ISORC
2003
IEEE
14 years 28 days ago
A Dynamic Shadow Approach for Mobile Agents to Survive Crash Failures
Fault tolerance schemes for mobile agents to survive agent server crash failures are complex since developers normally have no control over remote agent servers. Some solutions mo...
Simon Pears, Jie Xu, Cornelia Boldyreff
PPOPP
2003
ACM
14 years 27 days ago
Automated application-level checkpointing of MPI programs
Because of increasing hardware and software complexity, the running time of many computational science applications is now more than the mean-time-to-failure of high-performance com...
Greg Bronevetsky, Daniel Marques, Keshav Pingali, ...
HPDC
2008
IEEE
14 years 2 months ago
Dynasa: adapting grid applications to safety using fault-tolerant methods
Grid applications have been prone to encountering problems such as failures or malicious attacks during execution, due to their distributed and large-scale features. The applicati...
Xuanhua Shi, Jean-Louis Pazat, Eric Rodriguez, Hai...