Abstract. Today’s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data struc...
Timo Heister, Martin Kronbichler, Wolfgang Bangert...
Abstract. In this work we present two algorithms of irregular scatter/gather operations based on the binomial tree and Tr¨aff algorithms. We use the prediction provided by hetero...
Kiril Dichev, Vladimir Rychkov, Alexey L. Lastovet...
A major trend in HPC is the escalation toward manycore, where systems are composed of shared memory nodes featuring numerous processing units. Unfortunately, with scale comes compl...
Teng Ma, George Bosilca, Aurelien Bouteiller, Jack...
Abstract. VolpexMPI is an MPI library designed for volunteer computing environments. In order to cope with the fundamental unreliability of these environments, VolpexMPI deploys tw...
Parallel programming models on large-scale systems require a scalable system for managing the processes that make up the execution of a parallel program. The process-management sys...
Pavan Balaji, Darius Buntinas, David Goodell, Will...
With the ever-increasing numbers of cores per node on HPC systems, applications are increasingly using threads to exploit the shared memory within a node, combined with MPI across ...
MPI datatypes are a convenient abstraction for manipulating complex data structures and are useful in a number of contexts. In some cases, these descriptions need to be preserved o...
Abstract. With the number of computing elements spiraling to hundred of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas H&eacu...
Multicore processors have not only reintroduced Non-Uniform Memory Access (NUMA) architectures in nowadays parallel computers, but they are also responsible for non-uniform access ...