Instance based locality optimization 6 is a semi automatic program restructuring method that reduces the number of cache misses. The method imitates the human approach of consideri...
The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
The performance of a concurrent multithreaded architectural model, called superthreading 15 , is studied in this paper. It tries to integrate optimizing compilation techniques and...
Jenn-Yuan Tsai, Zhenzhen Jiang, Eric Ness, Pen-Chu...
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
The fast evolution of processor performance necessitates a permanent evolution of all the multiprocessor components, even for small to medium-scale symmetric multiprocessors (SMP)...
Wissam Hlayhel, Daniel Litaize, Laurent Fesquet, J...