We present a cache locality optimization technique that can optimize a loop nest even if the arrays referenced have different layouts in memory. Such a capability is required for a...
Mahmut T. Kandemir, J. Ramanujam, Alok N. Choudhar...
Time skewing and loop tiling has been known for a long time to be a highly beneficial acceleration technique for nested loops especially on bandwidth hungry multi-core processors...
Data dominated signal processing applications are typically described using large and multi-dimensional arrays and loop nests. The order of production and consumption of array ele...
Per Gunnar Kjeldsberg, Francky Catthoor, Sven Verd...
In this paper, we evaluate an adaptive loop parallelization strategy (i.e., a strategy that allows each loop nest to execute using different number of processors if doing so is be...
Ismail Kadayif, Mahmut T. Kandemir, Mustafa Karak&...
We want to perform compile-time analysis of an SPMD program and place barriers in it to synchronize it correctly, minimizing the runtime cost of the synchronization. This is the b...