Sciweavers

695 search results - page 11 / 139
» Cache based fault recovery for distributed systems
Sort
View

Publication
165views
12 years 6 days ago
Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore processo...
Shohei Gotoda, Naoki Shibata and Minoru Ito

Presentation
324views
12 years 16 days ago
Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault
In this paper, we propose a task scheduling al-gorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore process...
ASPLOS
2009
ACM
14 years 7 months ago
ASSURE: automatic software self-healing using rescue points
Software failures in server applications are a significant problem for preserving system availability. We present ASSURE, a system that introduces rescue points that recover softw...
Stelios Sidiroglou, Oren Laadan, Carlos Perez, Nic...
PVM
2005
Springer
14 years 6 days ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
ct Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
ICSE
2003
IEEE-ACM
13 years 12 months ago
Supporting Dependable Distributed Applications Through a Component-Oriented Middleware-Based Group Service
Abstract. Dependable distributed applications require flexible infrastructure support for controlled redundancy, replication, and recovery of components and services. However, mos...
Katia B. Saikoski, Geoff Coulson