Sciweavers

153 search results - page 10 / 31
» Supporting fault-tolerance for time-critical events in distr...
DSN
2003
IEEE
14 years 27 days ago
Integrating Recovery Strategies into a Primary Substation Automation System
The DepAuDE architecture provides middleware to integrate fault tolerance support into distributed embedded automation applications. It allows error recovery to be expressed in te...
Geert Deconinck, Vincenzo De Florio, Ronnie Belman...
HPDC
2009
IEEE
14 years 2 months ago
Interconnect agnostic checkpoint/restart in open MPI
Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine
IPPS
2007
IEEE
14 years 1 months ago
The Next Generation Software Workshop - IPDPS'07
This workshop provides a forum for an overview, project presentations, and discussion of the research fostered and funded initially by the NSF Next Generation Software (NGS) Progr...
Frederica Darema
ICDCS
2008
IEEE
14 years 2 months ago
stdchk: A Checkpoint Storage System for Desktop Grid Computing
Checkpointing is an indispensable technique to provide fault tolerance for long-running high-throughput applications like those running on desktop grids. This paper argues that...
Samer Al-Kiswany, Matei Ripeanu, Sudharshan S. Vaz...
IPPS
2000
IEEE
13 years 12 months ago
A Parallel Co-evolutionary Metaheuristic
In order to show that the parallel co-evolution of different heuristic methods may lead to an efficient search strategy, we have hybridized three heuristic agents of complementary beh...
Vincent Bachelet, El-Ghazali Talbi