In embedded system-on-a-chip (SoC) applications, the need for integrating heterogeneous processors in a single chip is increasing. An important issue in integrating heterogeneous processors is how to maintain the coherence of data caches. In this paper, we propose a hardware/software methodology to make caches coherent in heterogeneous multiprocessor platforms with shared memory. Our approach works with any combination of processors that support any invalidation-based protocol. As shown in our simulations, up to 38% speedup can be achieved with a 13-cycle miss penalty at the expense of simple hardware, compared to a pure software solution. Speedup can be improved even further as the miss penalty increases. In addition, our approach provides embedded system programmers a transparent view of shared data, removing the burden of software synchronization.
Taeweon Suh, Douglas M. Blough, Hsien-Hsin S. Lee