Abstract. We present two new compiler optimizations for explicitly parallel programs based on the CSSAME form: Lock-Independent Code Motion (LICM) and Mutex Body Localization (MBL)...
Diego Novillo, Ronald C. Unrau, Jonathan Schaeffer
Given the large communication overheads characteristic of modern parallel machines, optimizations that eliminate, hide or parallelize communication may improve the performance of ...
I consider the problem of the domain-specific optimization of programs. I review different approaches, discuss their potential, and sketch instances of them from the practice of ...
This paper presents an extensive characterization, tuning, and optimization of parallel I/O on the Cray XT supercomputer, named Jaguar, at Oak Ridge National Laboratory. We have c...
In this paper, we consider the problem of nding llpreserving ordering of a sparse symmetric and positive de nite matrix such that the reordered matrix is suitable for parallel fac...