Sciweavers

2226 search results - page 4 / 446
» Fault-Tolerant Parallel Applications with Dynamic Parallel S...
Sort
View
CLUSTER
2006
IEEE
14 years 1 months ago
FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?
One of the topics of paramount importance in the development of Cluster and Grid middleware is the impact of faults since their occurrence in Grid infrastructures and in large-sca...
William Hoarau, Pierre Lemarinier, Thomas Hé...
DSN
2007
IEEE
14 years 1 months ago
Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance
Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point towards multi-threaded multi-core designs,...
Alex Shye, Tipp Moseley, Vijay Janapa Reddi, Josep...
IPPS
2005
IEEE
14 years 1 months ago
Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries
We observe increasing interest in aggregating geographically distributed, heterogeneous resources to perform large scale computations. MPI remains the most popular programming par...
Dawid Kurzyniec, Vaidy S. Sunderam
HPCA
2003
IEEE
14 years 8 months ago
Dynamic Data Replication: An Approach to Providing Fault-Tolerant Shared Memory Clusters
A challenging issue in today's server systems is to transparently deal with failures and application-imposed requirements for continuous operation. In this paper we address t...
Rosalia Christodoulopoulou, Reza Azimi, Angelos Bi...