Abstract. Traditionally, loop nests are fused only when the data dependences in the loop nests are not violated. This paper presents a new loop fusion algorithm that is capable of ...
In this paper, we present a thorough analysis of thread-level parallelism available in production High Performance Computing (HPC) codes. We survey a number of techniques that are...
Data locality and synchronization overhead are two important factors that affect the performance of applications on multiprocessors. Loop fusion is an effective way to reduce s...
Edwin Hsing-Mean Sha, Chenhua Lang, Nelson L. Pass...
For some sequential loops, existing techniques that form speculative threads only at their loop boundaries do not adequately expose the speculative parallelism inherent in them. T...
Lin Gao, Lian Li, Jingling Xue, Tin-Fook...
Loop unrolling is one of the most promising parallelization techniques, because the nature of programs causes most of the processing time to be spent in their loops. Unrolling not...