level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unroll-and-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
While application performance and power-efficiency are both important, application correctness is even more important. In other words, if the application is misbehaving, it is li...
Shimin Chen, Phillip B. Gibbons, Michael Kozuch, T...
We present a parallel data processor centered around a programming model of so-called Parallelization Contracts (PACTs) and the scalable parallel execution engine Nephele [18]. Th...
Multi-stage programming (MSP) provides a disciplined approach to run-time code generation. In the purely functional setting, it has been shown how MSP can be used to reduce the ov...
Edwin Westbrook, Mathias Ricken, Jun Inoue, Yilong...
Most image processing algorithms can be parallelized by splitting parallel loops and by using very few communication patterns. Code parallelization using MPI still involves much p...