Abstract. We present two new compiler optimizations for explicitly parallel programs based on the CSSAME form: Lock-Independent Code Motion (LICM) and Mutex Body Localization (MBL)...
Diego Novillo, Ronald C. Unrau, Jonathan Schaeffer
Well-designed domain-specific languages enable the easy expression of problems, the application of domain-specific optimizations, and dramatic improvements in productivity for t...
Jun Cao, Ayush Goyal, Samuel P. Midkiff, James M. ...
Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Pr...
John A. Stratton, Vinod Grover, Jaydeep Marathe, B...
As parallel machines become part of the mainstream computing environment, compilers will need to apply synchronization optimizations to deliver efficient parallel software. This pap...