- In this paper we experiment with two optimization techniques we are considering implementing in a parallelizing compiler that generates parallel code for a distributed-memory sys...
In this paper, we propose a technique combining loop distribution with loop fusion to improve the timing performance without increasing the code size of the transformed loops. We ...
Meilin Liu, Qingfeng Zhuge, Zili Shao, Chun Xue, M...
Loop distribution and loop fusion are two effective loop transformation techniques to optimize the execution of the programs in DSP applications. In this paper, we propose a new t...
Loop unrolling is one of the most promising parallelization techniques, because the nature of programs causes most of the processing time to be spent in their loops. Unrolling not...
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...