Sciweavers

139 search results - page 15 / 28
» Software Fault Tolerance of Distributed Programs Using Compu...
HIPC
2000
Springer
13 years 11 months ago
Experiments with the CHIME Parallel Processing System
This paper presents the results from running five experiments with the Chime Parallel Processing System. The Chime System is an implementation of the CC++ programming language (p...
Anjaneya R. Chagam, Partha Dasgupta, Rajkumar Khan...
IPPS
1998
IEEE
13 years 12 months ago
A Generalized Forward Recovery Checkpointing Scheme
We propose a generalized forward recovery checkpointing scheme, with lookahead execution and rollback validation. This method takes advantage of voting and comparison on multiple v...
Ke Huang, Jie Wu, Eduardo B. Fernández
HPDC
2008
IEEE
14 years 2 months ago
Dynasa: adapting grid applications to safety using fault-tolerant methods
Grid applications have been prone to encountering problems such as failures or malicious attacks during execution, due to their distributed and large-scale features. The applicati...
Xuanhua Shi, Jean-Louis Pazat, Eric Rodriguez, Hai...
IPPS
1998
IEEE
13 years 12 months ago
Migration and Rollback Transparency for Arbitrary Distributed Applications in Workstation Clusters
Programmers and users of compute intensive scientific applications often do not want to (or even cannot) code load balancing and fault tolerance into their programs. The PBEAM syst...
Stefan Petri, Matthias Bolz, Horst Langendörf...
ICSE
2003
IEEE-ACM
14 years 24 days ago
Supporting Dependable Distributed Applications Through a Component-Oriented Middleware-Based Group Service
Dependable distributed applications require flexible infrastructure support for controlled redundancy, replication, and recovery of components and services. However, mos...
Katia B. Saikoski, Geoff Coulson