This paper proposes and studies a hardware-based, adaptive, controlled migration strategy for managing distributed L2 caches in chip multiprocessors. Building on an area-efficient shared cache design, the proposed scheme dynamically migrates cache blocks to the cache banks that minimize the average L2 access latency. Cache blocks are continuously monitored, and the optimal bank for each block is predicted, which alleviates the impact of non-uniform cache access latency. Because the scheme relies on migration alone, without replication, each cache block remains exclusive to a single bank, which in turn helps keep the cache miss rate low. Results from a full-system simulator show that the proposed controlled migration scheme outperforms the shared caching strategy and compares favorably with previously proposed replication schemes.
Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem
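
The placement idea summarized in the abstract, moving a block to the bank that minimizes its average L2 access latency, can be illustrated with a minimal sketch. It is not the paper's hardware mechanism: the 4x4 tiled mesh, the latency constants, the per-core access counters, and all function names are assumptions made here for illustration only.

```python
from typing import Dict

# All constants below are hypothetical; the abstract does not specify them.
MESH_WIDTH = 4      # assumed 4x4 tiled CMP with one core and one L2 bank per tile
BANK_LATENCY = 10   # assumed cycles to access a bank once the request arrives
HOP_LATENCY = 3     # assumed cycles per on-chip network hop


def hops(core: int, bank: int) -> int:
    """Manhattan distance between two tiles on the assumed mesh."""
    cx, cy = core % MESH_WIDTH, core // MESH_WIDTH
    bx, by = bank % MESH_WIDTH, bank // MESH_WIDTH
    return abs(cx - bx) + abs(cy - by)


def access_latency(core: int, bank: int) -> int:
    """Latency seen by `core` when the block resides in `bank`."""
    return BANK_LATENCY + HOP_LATENCY * hops(core, bank)


def average_latency(access_counts: Dict[int, int], bank: int) -> float:
    """Access-weighted average latency of one block if it lived in `bank`.

    `access_counts` maps core id -> number of observed accesses to the block.
    """
    total = sum(access_counts.values())
    if total == 0:
        raise ValueError("block has no recorded accesses")
    weighted = sum(n * access_latency(c, bank) for c, n in access_counts.items())
    return weighted / total


def best_bank(access_counts: Dict[int, int]) -> int:
    """Bank that minimizes the average latency; ties go to the lowest bank id."""
    banks = range(MESH_WIDTH * MESH_WIDTH)
    return min(banks, key=lambda b: average_latency(access_counts, b))
```

For example, under these assumptions a block accessed eight times by core 5 and once by core 6 would be placed in bank 5, next to its dominant sharer. Because the block is moved rather than copied, it still occupies a single bank, which matches the exclusiveness property the abstract describes.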