Sciweavers

» Fusion of Loops for Parallelism and Locality
HPCC
2007
Springer
14 years 1 month ago
High Performance FFT on SGI Altix 3700
We have developed a high-performance FFT on SGI Altix 3700, improving the efficiency of the floating-point operations required to compute FFT by using a kind of loop fusion techni...
Akira Nukada, Daisuke Takahashi, Reiji Suda, Akira...
SC
2005
ACM
14 years 17 days ago
Integrated Loop Optimizations for Data Locality Enhancement of Tensor Contraction Expressions
A very challenging issue for optimizing compilers is the phase ordering problem: In what order should a collection of compiler optimizations be performed? We address this problem ...
Swarup Kumar Sahoo, Sriram Krishnamoorthy, Rajkira...
IPPS
2006
IEEE
14 years 1 month ago
Memory minimization for tensor contractions using integer linear programming
This paper presents a technique for memory optimization for a class of computations that arises in the field of correlated electronic structure methods such as coupled cluster and...
A. Allam, J. Ramanujam, Gerald Baumgartner, P. Sad...
IEEEPACT
1998
IEEE
13 years 11 months ago
A Matrix-Based Approach to the Global Locality Optimization Problem
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
PLDI
1993
ACM
13 years 11 months ago
Global Optimizations for Parallelism and Locality on Scalable Parallel Machines
Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Jennifer-Ann M. Anderson, Monica S. Lam