Recent advances in polyhedral compilation technology have made it feasible to automatically transform affine sequential loop nests for tiled parallel execution on multi-core proce...
Abstract--Loop tiling is an important compiler transformation used for enhancing data locality and exploiting coarsegrained parallelism. Tiled codes in which tile sizes are runtime...
Albert Hartono, Muthu Manikandan Baskaran, J. Rama...
—Thread-level parallelism (TLP) has been extensively studied in order to overcome the limitations of exploiting instruction-level parallelism (ILP) on high-performance superscala...
With the widening performance gap between processors and main memory, efficient memory accessing behavior is necessary for good program performance. Loop partition is an effective...
This paper describes a generalisation of modulo scheduling to parallelise loops for SpMT processors that exploits simultaneously both instruction-level parallelism and thread-leve...
Lin Gao 0002, Quan Hoang Nguyen, Lian Li 0002, Jin...