This paper presents a new compiler optimization algorithm that parallelizes applications for symmetric, sharedmemory multiprocessors. The algorithm considers data locality, parall...
The increasing complexity of hardware features for recent processors makes high performance code generation very challenging. In particular, several optimization targets have to b...
Loop unrolling is one of the most promising parallelization techniques, because the nature of programs causes most of the processing time to be spent in their loops. Unrolling not...
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...