ASPLOS
1991
ACM

The Cache Performance and Optimizations of Blocked Algorithms

Blocking is a well-known optimization technique for improving the effectiveness of memory hierarchies. Instead of operating on entire rows or columns of an array, blocked algorithms operate on submatrices or blocks, so that data loaded into the faster levels of the memory hierarchy are reused. This paper presents cache performance data for blocked programs and evaluates several optimizations to improve this performance. The data is obtained by a theoretical model of data conflicts in the cache, which has been validated by large amounts of simulation. We show that the degree of cache interference is highly sensitive to the stride of data accesses and the size of the blocks, and can cause wide variations in machine performance for different matrix sizes. The conventional wisdom of trying to use the entire cache, or even a fixed fraction of the cache, is incorrect. If a fixed block size is used for a given cache size, the block size that minimizes the expected number of cache misses i...
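To make the blocking idea in the abstract concrete, here is a minimal sketch of one common loop structure for blocked matrix multiplication, set beside an unblocked version. It is an illustration only, not code taken from the paper: the function names `matmul_naive` and `matmul_blocked` and the parameter `block` are assumptions introduced here, and a pure-Python version shows the loop structure rather than any real cache behavior.

```python
def matmul_naive(a, b):
    """Unblocked product: each row of a is combined with whole rows of b."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Blocked product: work on one block-by-block tile of b at a time.

    `block` is a hypothetical tuning parameter for this sketch. The tile
    b[kk:kk+block][jj:jj+block] is reused for every row i before the loops
    move on to the next tile, which is the reuse that blocking aims for.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for kk in range(0, m, block):          # tile origin along the shared dimension
        for jj in range(0, p, block):      # tile origin along the columns of b
            for i in range(n):
                for k in range(kk, min(kk + block, m)):
                    aik = a[i][k]
                    for j in range(jj, min(jj + block, p)):
                        c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_blocked(a, b, block=2))
```

Both functions accumulate each output element in the same order of `k`, so they return identical results. In a compiled language the blocked version differs only in which data stays resident in the cache between reuses, and, as the abstract notes, how well that works depends on the block size and on the stride of the accesses.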
Monica S. Lam, Edward E. Rothberg, Michael E. Wolf
Added 27 Aug 2010
Updated 27 Aug 2010
Type Conference
Year 1991
Where ASPLOS
Authors Monica S. Lam, Edward E. Rothberg, Michael E. Wolf