Bus-based shared memory multiprocessors with private caches and snooping write-invalidate cache coherence protocols are the dominant form of small- to medium-scale parallel machines today. In these systems, high memory latency is the major hurdle to achieving high performance. One way to cope with this problem is to use techniques that tolerate the latency. Software-controlled cache prefetching and data forwarding are two widely used latency-tolerance techniques in scalable cache-coherent shared memory multiprocessors. However, some previous studies have shown that these techniques are not as effective in bus-based shared memory multiprocessors. In this paper, we propose a novel software-controlled technique called cache injection, which combines the consumer-initiated and producer-initiated approaches with the broadcast nature of the bus. Performance evaluation based on program-driven simulation and a set of scientific applications and test benchmarks shows that cache ...
Aleksandar Milenkovic, Veljko M. Milutinovic