Abstract—To harness the full compute power of manycore processors, future designs must focus on effective utilization of on-chip cache and bandwidth resources. In this paper, we address the dual goals of (1) reducing on-chip communication overheads and (2) improving on-chip cache space utilization, which yields a larger effective cache capacity and thereby potentially reduces off-chip traffic. We present a new cache coherence protocol that decouples the logical binding between data and metadata in a cache set. This decoupling allows the data and metadata for a cache line to be independently delegated to any location on chip. Delegating metadata to the current owner/modifier of a cache line avoids the communication overhead of metadata maintenance and localizes communication between interacting processes. Decoupling metadata from data also lets the cache's data space be used more efficiently by avoiding unnecessary data replication. Using full system si...
Hemayet Hossain, Sandhya Dwarkadas, Michael C. Hua