This paper describes dynamic pressure-aware associative placement (DPAP), a novel distributed cache management scheme for large-scale chip multiprocessors. Our work is motivated by the highly non-uniform distribution of memory accesses across cache sets in different L2 banks. DPAP decouples the physical locations of cache blocks from their addresses in order to reduce misses caused by destructive interference. Temporal pressure at the on-chip last-level cache is continuously collected at the granularity of a group (composed of local cache sets) and periodically recorded at the memory controller(s) to guide the placement process. An incoming block is then placed in the cache group that exhibits the minimum pressure. Results from a full-system simulator demonstrate that DPAP outperforms the baseline shared NUCA scheme by an average of 8.3% and by as much as 18.9% for the benchmark programs we examined. Furthermore, evaluations showed that DPAP outperforms related cach...
Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem
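
The placement policy summarized in the abstract (collect pressure per cache group, record it periodically at the memory controller, place an incoming block in the group with minimum pressure) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The class `PressureTable`, its method names, the use of a simple event counter as the pressure metric, and the lowest-index tie-break are all assumptions made for illustration; the abstract does not specify how pressure is measured or how epochs are defined.

```python
from dataclasses import dataclass, field


@dataclass
class PressureTable:
    """Hypothetical memory-controller table with one pressure value per cache group."""

    num_groups: int
    recorded: list = field(init=False)  # snapshot consulted when placing blocks
    live: list = field(init=False)      # counters collected during the current epoch

    def __post_init__(self) -> None:
        self.recorded = [0] * self.num_groups
        self.live = [0] * self.num_groups

    def note_pressure(self, group: int, amount: int = 1) -> None:
        """Continuously collect pressure for a group (assumed: a plain event count)."""
        self.live[group] += amount

    def end_epoch(self) -> None:
        """Periodically record the collected values and start a fresh epoch."""
        self.recorded = self.live
        self.live = [0] * self.num_groups

    def pick_group(self) -> int:
        """Return the group with minimum recorded pressure (ties: lowest index)."""
        return min(range(self.num_groups), key=self.recorded.__getitem__)


# Usage: four groups, with pressure events observed during one epoch.
table = PressureTable(num_groups=4)
for g in (0, 0, 1, 3, 3, 3):
    table.note_pressure(g)
table.end_epoch()
chosen = table.pick_group()
print(chosen)  # group 2 recorded no pressure in this epoch
```

Keeping a recorded snapshot separate from the live counters mirrors the split between continuous collection and periodic recording: placement decisions use only the values from the last completed epoch.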