This paper describes a tiling technique that can be used by application programmers and optimizing compilers to obtain I/O-efficient versions of regular scientific loop nests. Due ...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
We present a new cache oblivious scheme for iterative stencil computations that performs beyond system bandwidth limitations as though gigabytes of data could reside in an enormou...
Robert Strzodka, Mohammed Shaheen, Dawid Pajak, Ha...
Many important applications, such as those using sparse data structures, have memory reference patterns that are unknown at compile-time. Prior work has developed runtime reorderi...
Michelle Mills Strout, Larry Carter, Jeanne Ferran...
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
—Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus P&uu...