We discuss some of the difficulties inherent in trace collection and trace-driven cache simulation. We then describe our multiprocessor tracing technique and verify that it accurately collects long traces. We propose sampling as a method to reduce the required disk space, speed up simulations, and effectively enlarge the trace buffer of our hardware monitor, thereby decreasing trace distortion. To this end, we investigate time sampling and two types of set sampling. We conclude that the second set sampling technique achieves the most accurate results. The miss rate for the second set sampling method is calculated as the number of misses to sampled sets divided by the total number of references scaled by the sample size, that is, by the estimated number of references to the sampled sets. We determined that a 10% sample size was the most accurate while still reducing the required disk space.
Niki C. Thornock, J. Kelly Flanagan
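The miss rate calculation stated in the abstract for the second set sampling method can be illustrated with a minimal sketch. It assumes that "scaled by the sample size" means multiplying the total reference count by the fraction of cache sets sampled. The function name, parameter names, and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
def set_sampled_miss_rate(sampled_misses: int,
                          total_references: int,
                          sample_fraction: float) -> float:
    """Estimate the overall miss rate from misses observed in sampled sets.

    Assumption (not confirmed by the source): the denominator is the total
    number of references multiplied by the fraction of sets sampled, which
    approximates the number of references that map to the sampled sets.
    """
    if total_references <= 0:
        raise ValueError("total_references must be positive")
    if not 0.0 < sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be in (0, 1]")
    if sampled_misses < 0:
        raise ValueError("sampled_misses must be non-negative")

    # Estimated number of references that fall in the sampled sets.
    estimated_sampled_references = total_references * sample_fraction
    return sampled_misses / estimated_sampled_references


if __name__ == "__main__":
    # Hypothetical numbers for illustration only: a 10% set sample,
    # one million total references, 4,200 misses to the sampled sets.
    rate = set_sampled_miss_rate(4_200, 1_000_000, 0.10)
    print(f"estimated miss rate: {rate:.4f}")
```

With a sample fraction of 1.0 the estimate reduces to the ordinary miss rate (misses divided by references), which is a quick sanity check on the formula.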