This paper describes a tiling technique that can be used by application programmers and optimizing compilers to obtain I/O-efficient versions of regular scientific loop nests. Due ...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
This paper applies unimodular transformations and tiling to improve data locality of a loop nest. Due to data dependences and reuse information, not all dimensions of the iteration...
Abstract. Traditionally, loop nests are fused only when the data dependences in the loop nests are not violated. This paper presents a new loop fusion algorithm that is capable of ...
We present a new approach that enables compiler optimization of procedure calls and loop nests containing procedure calls. We introduce two interprocedural transformationsthat mov...
To meet the conflicting goals of high-performance low-cost embedded systems, critical application loop nests are commonly executed on specialized hardware accelerators. These loop...
Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott ...
In this paper, we propose to model memory access profile information as loop nests exhibiting useful characteristics on the memory behavior, such as periodicity, linearly linked m...
A technique for estimating the cost of executing a loop nest in parallel (parallel start-up overhead) is described in this paper. This technique is of utmost importance for paralle...
This paper analyzes and quantifies the locality characteristics of numerical loop nests in order to suggest future directions for architecture and software cache optimizations. Si...
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
This paper presents a novel source code transformation for control flow optimizationcalled loop nest splitting which minimizes the number of executed if-statements in loop nests ...