The most intuitive memory model for shared-memory multithreaded programming is sequential consistency (SC), but it disallows the use of many compiler and hardware optimizations th...
Daniel Marino, Abhayendra Singh, Todd D. Millstein...
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...
We present a new approach that enables compiler optimization of procedure calls and loop nests containing procedure calls. We introduce two interprocedural transformations that mov...
The rapid growth of silicon densities has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, in most cases, the application needs...
Girish Venkataramani, Walid A. Najjar, Fadi J. Kur...
Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where cross-processor data dependencie...