The widening gap between CPU and memory speed has made caches an integral feature of modern high-performance processors. Cache memory is highly configurable, so choosing a configuration can require extensive design space exploration, which is generally performed using execution-driven or trace-driven simulation. Execution-driven simulators can be highly accurate but require a detailed development flow and may impose performance costs. Trace-driven simulators are an efficient alternative, but maintaining large traces can present storage and portability problems. We propose a distribution-driven trace generation methodology as an alternative to traditional execution- and trace-driven simulation. An adaptation of the Least Recently Used Stack Model is used to concisely capture the key locality features in a trace, and a two-state Markov chain model is used for trace generation. Simulation and analysis of a variety of embedded application traces demonstrate the cacheability characteristics of the synthetic tra...
Rahman Hassan, Antony Harris, Nigel P. Topham, Ari
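
As a rough illustration of the kind of locality summary the abstract refers to, the following Python sketch computes classic LRU stack distances for an address trace and then draws a synthetic trace from the resulting distance histogram. It is a minimal sketch under stated assumptions, not the authors' method: the function names `stack_distances` and `synthesize` are hypothetical, the paper's adaptation of the stack model is not reproduced, and the two-state Markov chain is left out because the abstract does not define its states.

```python
import random
from collections import Counter


def stack_distances(trace):
    """Return the LRU stack distance of each reference (None on first touch)."""
    stack = []  # most recently used address sits at index 0
    distances = []
    for addr in trace:
        if addr in stack:
            depth = stack.index(addr)  # 0 means the MRU item was re-referenced
            stack.pop(depth)
            distances.append(depth)
        else:
            distances.append(None)  # cold reference: address not seen before
        stack.insert(0, addr)
    return distances


def synthesize(distances, length, seed=0):
    """Generate a synthetic trace by sampling the distance histogram.

    Assumption: distances are drawn independently; the paper instead drives
    generation with a two-state Markov chain, which is not modelled here.
    """
    rng = random.Random(seed)
    hist = Counter(distances)
    keys = list(hist)
    weights = [hist[k] for k in keys]
    stack, out, next_addr = [], [], 0
    for _ in range(length):
        d = rng.choices(keys, weights)[0]
        if d is None or d >= len(stack):
            addr = next_addr  # introduce a fresh synthetic address
            next_addr += 1
        else:
            addr = stack.pop(d)  # reuse the address found at depth d
        stack.insert(0, addr)
        out.append(addr)
    return out
```

Only the histogram needs to be stored to regenerate a trace of any length, which is the storage advantage a distribution-driven approach aims for. A linear-time list search is used here for clarity; a tree-based structure would be the usual choice for long traces.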