Sciweavers

695 search results - page 11 / 139
» Cache based fault recovery for distributed systems

Publication
165views
13 years 10 months ago
Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore processo...
Shohei Gotoda, Naoki Shibata and Minoru Ito

Presentation
324views
13 years 10 months ago
Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore process...
ASPLOS
2009
ACM
16 years 5 months ago
ASSURE: automatic software self-healing using rescue points
Software failures in server applications are a significant problem for preserving system availability. We present ASSURE, a system that introduces rescue points that recover softw...
Stelios Sidiroglou, Oren Laadan, Carlos Perez, Nic...
PVM
2005
Springer
15 years 9 months ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
Fault Tolerant MPI (FT-MPI) [6] was designed as a solution to allow applications different methods to handle process failures beyond simple checkpoint-restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
ICSE
2003
IEEE-ACM
15 years 9 months ago
Supporting Dependable Distributed Applications Through a Component-Oriented Middleware-Based Group Service
Dependable distributed applications require flexible infrastructure support for controlled redundancy, replication, and recovery of components and services. However, mos...
Katia B. Saikoski, Geoff Coulson