Current on-chip block-centric memory hierarchies exploit access patterns at the fine-grain scale of small blocks. Several recently proposed techniques for coherence traffic reduction and prefetching suggest that further useful patterns emerge with a macroscopic, coarse-grain view. To exploit coarsegrain behavior, previous work extended conventional caches with additional coarse-grain tracking and management structures considerably increasing overall cost and complexity. This paper demonstrates that as multi-megabyte caches have become commonplace, coarse-grain tracking and management no longer needs to be an afterthought. This functionality comes “for free” via RegionTracker. RegionTracker is a dual-grain cache design that maintains block-level communication while directly supporting coarse-grain tracking and management. Compared to a block-centric conventional cache of the same data capacity, RegionTracker requires less area to achieve a nearly identical miss rate (within 1%). Re...