As multiprocessor systems-on-chip become a reality, performance modeling becomes a challenge. To quickly evaluate many architectures, some type of high-level simulation is required, including high-level cache simulation. We propose to perform this cache simulation by defining a metric to represent memory behavior independently of cache structure and back-annotate this into the original application. While the annotation phase is complex, requiring time comparable to normal address trace based simulation, it need only be performed once per application set and thus enables simulation to be sped up by a factor of 20 to 50 over trace based simulation. This is important for embedded systems, as software is often evaluated against many input sets and many architectures. Our results show the technique is accurate to within 20% of miss rate for uniprocessors and was able to reduce the die area of a multiprocessor chip by a projected 14% over a naive design by accurately sizing caches for each ...
Joshua J. Pieper, Alain Mellan, JoAnn M. Paul, Don