The performance of superscalar processors is more sensitive to memory system delay than that of their single-issue predecessors. This paper examines alternative data access microarchitectures that effectively support compiler-assisted data prefetching in superscalar processors. In particular, a prefetch buffer is shown to be more effective than increasing the cache dimension in solving the cache pollution problem. All in all, we show that a small data cache with compiler-assisted data prefetching can achieve a performance level close to that of an ideal cache.
William Y. Chen, Scott A. Mahlke, Pohua P. Chang,