Data locality is critical to achievinghigh performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Exploiting parallelism at both the multiprocessor level and the instruction level is an e ective means for supercomputers to achieve high-performance. The amount of instruction-le...
Scott A. Mahlke, William Y. Chen, John C. Gyllenha...
Storage mapping optimization is a flexible approach to folding array dimensions in numerical codes. It is designed to reduce the memory footprint after a wide spectrum of loop tr...
We present a new approach that enables compiler optimization of procedure calls and loop nests containing procedure calls. We introduce two interprocedural transformationsthat mov...
To take advantage of recent architectural improvements in microprocessors, advanced compiler optimizations such as software pipelining have been developed 1, 2, 3, 4]. Unfortunate...