Sciweavers

703 search results - page 16 / 141
» MPI Cluster System Software
Sort
View
PVM
2005
Springer
14 years 3 months ago
Implementing Byte-Range Locks Using MPI One-Sided Communication
We present an algorithm for implementing byte-range locks using MPI passive-target one-sided communication. This algorithm is useful in any scenario in which multiple processes of ...
Rajeev Thakur, Robert B. Ross, Robert Latham
CCGRID
2006
IEEE
14 years 4 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC(High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
EUROPAR
2009
Springer
14 years 4 months ago
Impact of Quad-Core Cray XT4 System and Software Stack on Scientific Computation
An upgrade from dual-core to quad-core AMD processor on the Cray XT system at the Oak Ridge National Laboratory (ORNL) Leadership Computing Facility (LCF) has resulted in significa...
Sadaf R. Alam, Richard F. Barrett, Heike Jagode, J...
CLUSTER
2004
IEEE
14 years 1 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
IPPS
2007
IEEE
14 years 4 months ago
The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI
To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...