This paper describes Compiler-Directed Content-Aware Prefetching (CDCAP), an integrated compiler and hardware approach for prefetching dynamic data structures. The approach utiliz...
Cache optimizations typically include code transformations to increase the locality of memory accesses. An orthogonal approach is to enable for latency hiding by introducing prefet...
This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-cont...
In this paper, we study the instruction cache miss behavior of four modern commercial applications (a database workload, TPC-W, SPECjAppServer2002 and SPECweb99). These applicatio...
Stride prefetching is recognized as an important technique to improve memory access performance. The prior work usually profiles and/or analyzes the program behavior offline, and u...