Abstract--Loop tiling is an important compiler transformation used for enhancing data locality and exploiting coarse-grained parallelism. Tiled codes in which tile sizes are runtime parameters, called parametrically tiled codes, are important for empirical tuning systems such as ATLAS. Recent work has addressed the problem of generating sequential parametrically tiled code. In this paper, we describe DynTile, a system that transforms untiled sequential C code containing imperfectly nested affine loops into parametrically tiled code for parallel execution on multicore processors. The effectiveness of the system is demonstrated using a number of benchmarks on an eight-core system.
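To make the notion of a parametrically tiled code concrete, the following minimal sketch contrasts an untiled loop nest with a tiled version whose tile sizes are ordinary runtime arguments. It is an illustration only and is not output produced by DynTile: the function names `matvec_untiled` and `matvec_tiled`, the helper `min_sz`, and the choice of a matrix-vector product are all assumptions made for this example, and the sketch is sequential and perfectly nested, so it does not show the parallel or imperfectly nested cases the paper targets.

```c
#include <stddef.h>

/* Hypothetical helper: clips a tile's upper bound at the loop boundary
 * so that partial tiles at the edges are handled correctly. */
static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Untiled reference: y = A * x for a row-major n x m matrix A. */
void matvec_untiled(size_t n, size_t m,
                    const double *A, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = 0.0;
        for (size_t j = 0; j < m; j++)
            y[i] += A[i * m + j] * x[j];
    }
}

/* Parametrically tiled version of the same computation.
 * ti and tj are tile sizes chosen at run time; both must be >= 1.
 * The two outer loops enumerate tiles, the two inner loops
 * enumerate the points inside one tile. */
void matvec_tiled(size_t n, size_t m, size_t ti, size_t tj,
                  const double *A, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = 0.0;

    for (size_t it = 0; it < n; it += ti)
        for (size_t jt = 0; jt < m; jt += tj)
            for (size_t i = it; i < min_sz(it + ti, n); i++)
                for (size_t j = jt; j < min_sz(jt + tj, m); j++)
                    y[i] += A[i * m + j] * x[j];
}
```

Because `ti` and `tj` are function arguments rather than compile-time constants, an empirical tuner can try many tile sizes against a single compiled binary instead of regenerating and recompiling the code for each candidate.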
Albert Hartono, Muthu Manikandan Baskaran, J. Rama