Sciweavers

» Implementing Fault-Tolerant Distributed Applications
DSN
2004
IEEE
13 years 11 months ago
Improving System Dependability with Functional Alternatives
We present the concept of alternative functionality for improving dependability in distributed embedded systems. Alternative functionality is a mechanism that complements traditio...
Charles P. Shelton, Philip Koopman
OSDI
1996
ACM
13 years 9 months ago
Microkernels Meet Recursive Virtual Machines
This paper describes a novel approach to providing modular and extensible operating system functionality and encapsulated environments based on a synthesis of microkernel and virtu...
Bryan Ford, Mike Hibler, Jay Lepreau, Patrick Tull...
APPT
2009
Springer
14 years 2 months ago
Evaluating SPLASH-2 Applications Using MapReduce
MapReduce has been prevalent for running data-parallel applications. By hiding other non-functionality parts such as parallelism, fault tolerance and load balance from programmers,...
Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, ...
ICPPW
2009
IEEE
13 years 5 months ago
Analyzing Checkpointing Trends for Applications on the IBM Blue Gene/P System
Current petascale systems have tens of thousands of hardware components and complex system software stacks, which increase the probability of faults occurring during the lifetime ...
Harish Gapanati Naik, Rinku Gupta, Pete Beckman
HPCC
2010
Springer
13 years 7 months ago
A Generic Execution Management Framework for Scientific Applications
Managing the execution of scientific applications in a heterogeneous grid computing environment can be a daunting task, particularly for long running jobs. Increasing fault tolera...
Tanvire Elahi, Cameron Kiddle, Rob Simmonds