This paper presents a Data-Distributed Execution approach that exploits iteration-level parallelism in loops operating over arrays. It performs data-dependency analysis and uses the results to distribute arrays over the local memories of the processing elements (PEs). The code is then transformed to “follow” the data distribution: each loop is spawned on all PEs concurrently, but its boundary conditions are modified so that each PE operates mostly on its local sub-range of the data, keeping remote accesses to a minimum. The approach has been tested on the EM-4 supercomputer by implementing several benchmark programs. The experiments show that high speedup is achieved by automatic parallelization of conventional Fortran-like programs.
Lubomir Bic, Mayez A. Al-Mouhamed
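
The loop transformation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple block distribution of a one-dimensional array, simulates the PEs sequentially, and uses hypothetical helper names (`owner`, `local_bounds`, `run_distributed`). It shows how clamping the loop bounds to each PE's local sub-range leaves only the accesses at block boundaries remote.

```python
def owner(i, n, num_pes):
    """Return the PE that owns element i under an assumed block distribution."""
    block = -(-n // num_pes)  # ceiling division: elements per PE
    return i // block


def local_bounds(lo, hi, n, num_pes, pe):
    """Clamp the global loop range [lo, hi) to the indices owned by `pe`."""
    block = -(-n // num_pes)
    start = pe * block
    end = min(start + block, n)
    return max(lo, start), min(hi, end)


def run_distributed(b, num_pes):
    """Simulate the loop `a[i] = b[i] + b[i-1]` for i in 1..n-1 on every PE.

    Each PE runs the same loop body with bounds restricted to its own block.
    Returns the result array and the number of reads that crossed PEs.
    """
    n = len(b)
    a = [0] * n
    remote_reads = 0
    for pe in range(num_pes):  # on a real machine these would run concurrently
        start, end = local_bounds(1, n, n, num_pes, pe)
        for i in range(start, end):
            if owner(i - 1, n, num_pes) != pe:
                remote_reads += 1  # b[i-1] lives in a neighbouring PE's memory
            a[i] = b[i] + b[i - 1]
    return a, remote_reads


if __name__ == "__main__":
    result, remote = run_distributed(list(range(8)), num_pes=4)
    print(result, remote)
```

With 8 elements on 4 PEs, each PE owns two elements, and only the first iteration of PEs 1 to 3 reads a value from a neighbouring block; every other access is local.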