This paper proposes Dynamic Cache Clustering (DCC), a novel distributed cache management scheme for large-scale chip multiprocessors. Under DCC, each core’s cache cluster comprises a number of L2 cache banks, and clusters are constructed, expanded, and contracted dynamically to match each core’s cache demand. Varying the size of an on-chip cache cluster trades average L2 access latency against L2 miss rate. DCC uniquely and efficiently optimizes both metrics and continuously tracks a near-optimal cache organization among many possible configurations. Results from a full-system simulator demonstrate that DCC outperforms alternative L2 cache designs.

Categories and Subject Descriptors: C.0 [Computer Systems Organization]: System architectures

General Terms: Design, Management, Experimentation, Performance

Keywords: Chip Multiprocessor (CMP), Non-Uniform Cache Architecture (NUCA)

Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem