There has been much work recently on improving the locality performance of loop nests in scientific programs through the use of loop as well as data layout optimizations. However,...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
The difficulty of handling out-of-core data limits the performance of supercomputers as well as the potential of the parallel machines. Since writing an efficient out-of-core ve...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
This paper presents a technique for memory optimization for a class of computations that arises in the field of correlated electronic structure methods such as coupled cluster and...
A. Allam, J. Ramanujam, Gerald Baumgartner, P. Sad...
The rapid advances in high-performancecomputer architectureand compilationtechniques provide both challenges and opportunitiesto exploitthe rich solution space of software pipeline...
Ramaswamy Govindarajan, Erik R. Altman, Guang R. G...
The effectiveness of the memory hierarchy is critical for the performance of current processors. The performance of the memory hierarchy can be improved by means of program transf...