Construction of complex array operations by composition of more basic ones allows for abstract and concise specifications of algorithms. Unfortunately, na¨ıve compilation of suc...
In a cluster of multiple processors or cpu-cores, many processes may run on each compute node. Each process tends to issue contiguous I/O requests for snapshot, checkpointing or s...
Gather and scatter are data redistribution functions of longstanding importance to high performance computing. In this paper, we present a highly-general array operator with power...
Steven J. Deitz, Bradford L. Chamberlain, Sung-Eun...
Tiling has proven to be an effective mechanism to develop high performance implementations of algorithms. Tiling can be used to organize computations so that communication costs i...
Ganesh Bikshandi, Jia Guo, Daniel Hoeflinger, Gheo...
Abstract—Many studies have shown that load imbalancing causes significant performance degradation in High Performance Computing (HPC) applications. Nowadays, Multi-Threaded (MT1...
Carlos Boneti, Roberto Gioiosa, Francisco J. Cazor...