Abstract. High Performance Fortran (hpf) is a data-parallel Fortran for Distributed Memory Multiprocessors. Hpf provides an interesting programming model but compilers are yet to c...
The growing influence of wire delay in cache design has meant that access latencies to last-level cache banks are no longer constant. Non-Uniform Cache Architectures (NUCAs) have ...
There has been much work recently on improving the locality performance of loop nests in scientific programs through the use of loop as well as data layout optimizations. However,...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
With the widening performance gap between processors and main memory, efficient memory accessing behavior is necessary for good program performance. Loop partition is an effective...
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a directly...
Kevin O'Brien, Kathryn M. O'Brien, Zehra Sura, Ton...