Sciweavers

59 search results - page 4 / 12
» Improving Locality in Out-of-Core Computations Using Data La...
Sort
View
ISHPC
2003
Springer
14 years 22 days ago
Code and Data Transformations for Improving Shared Cache Performance on SMT Processors
Simultaneous multithreaded processors use shared on-chip caches, which yield better cost-performance ratios. Sharing a cache between simultaneously executing threads causes excessi...
Dimitrios S. Nikolopoulos
ICS
1999
Tsinghua U.
13 years 11 months ago
Nonlinear array layouts for hierarchical memory systems
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory....
Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Le...
PLDI
1995
ACM
13 years 11 months ago
Unifying Data and Control Transformations for Distributed Shared Memory Machines
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Michal Cierniak, Wei Li
ASPLOS
1994
ACM
13 years 11 months ago
Compiler Optimizations for Improving Data Locality
In the past decade, processor speed has become significantly faster than memory speed. Small, fast cache memories are designed to overcome this discrepancy, but they are only effe...
Steve Carr, Kathryn S. McKinley, Chau-Wen Tseng
IPPS
2000
IEEE
13 years 12 months ago
Dynamic Data Layouts for Cache-Conscious Factorization of DFT
Effective utilization of cache memories is a key factor in achieving high performance in computing the Discrete Fourier Transform (DFT). Most optimizationtechniques for computing ...
Neungsoo Park, Dongsoo Kang, Kiran Bondalapati, Vi...