This paper explores statistical simulation as a fast simulation technique for driving chip multiprocessor (CMP) design space exploration. Statistical simulation measures a number of important program execution characteristics, generates a synthetic trace from them, and simulates that synthetic trace. The main benefit is that a synthetic trace is very small compared to a real program trace. This paper advances statistical simulation by modeling shared resources, such as shared caches and off-chip bandwidth. This is done (i) by collecting cache set access probabilities and per-set LRU stack depth profiles, and (ii) by modeling a program’s time-varying execution behavior in the synthetic trace. The key benefit is that the statistical profile is independent of any particular cache configuration and of the amount of multiprocessing, which enables statistical simulation to model conflict behavior in shared caches when multiple programs are co-executing on a CMP. We demonstrate that s...
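
To make the two profiled quantities concrete, the following is a minimal sketch of how cache set access probabilities and per-set LRU stack depth histograms could be gathered from an address stream. It is an illustration under stated assumptions, not the paper's profiling tool: the function names `collect_profile` and `estimated_misses`, the fixed number of sets, and the 64-byte line size are all hypothetical choices made for this example, and the sketch does not show how the paper makes the profile independent of the cache configuration.

```python
from collections import Counter, defaultdict


def collect_profile(addresses, num_sets=4, line_size=64):
    """Profile a stream of byte addresses (hypothetical helper).

    Returns (set_prob, depth_hist):
      set_prob[s]      fraction of all accesses that map to set s
      depth_hist[s][d] number of accesses to set s found at LRU stack
                       depth d (0 = most recently used); d is None for
                       a first-time (cold) reference to a line
    """
    stacks = defaultdict(list)          # per-set LRU stack, MRU first
    depth_hist = defaultdict(Counter)   # per-set stack depth histogram
    set_counts = Counter()              # accesses per set

    for addr in addresses:
        line = addr // line_size        # cache line number
        s = line % num_sets             # assumed simple modulo indexing
        stack = stacks[s]
        set_counts[s] += 1

        if line in stack:
            depth = stack.index(line)   # position before this access
            stack.pop(depth)
        else:
            depth = None                # never seen in this set before
        depth_hist[s][depth] += 1
        stack.insert(0, line)           # line becomes most recently used

    total = sum(set_counts.values())
    set_prob = {s: c / total for s, c in set_counts.items()} if total else {}
    return set_prob, depth_hist


def estimated_misses(depth_hist, associativity):
    """Count accesses that miss in an LRU cache with the given associativity.

    An access misses if it is cold or if its stack depth is at least the
    associativity (the line would already have been evicted).
    """
    return sum(
        count
        for hist in depth_hist.values()
        for depth, count in hist.items()
        if depth is None or depth >= associativity
    )
```

Because the depth histogram is collected once and then queried for different associativities, `estimated_misses(depth_hist, 2)` and `estimated_misses(depth_hist, 4)` can be compared without reprocessing the address stream, which is the general property that makes stack depth profiles attractive for design space exploration.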