Abstract. We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrixvector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...
Cache-Content-Duplication (CCD) occurs when there is a miss for a block in a cache and the entire content of the missed block is already in the cache in a block with a different t...
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
While many block replacement algorithms for buffer caches have been proposed to address the wellknown drawbacks of the LRU algorithm, they are not robust and cannot maintain a cons...
Cache blocks often exhibit a small number of uses during their life time in the last-level cache. Past research has exploited this property in two different ways. First, replacem...