In this paper we present the internal representation and optimizations used by the CASH compiler for improving the memory parallelism of pointer-based programs. CASH uses an SSA-b...
This paper describes transformation techniques for out-of-core programs (i.e., those that deal with very large quantities of data) based on exploiting locality using a combination...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
— In this paper we give a theoretical model for determining the synchronization frequency that minimizes the parallel execution time of loops with uniform dependencies dynamicall...
Florina M. Ciorba, Ioannis Riakiotakis, Theodore A...
Abstract. Embedded systems have strict timing and code size requirements. Retiming is one of the most important optimization techniques to improve the execution time of loops by in...
Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edw...
When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical ...