The memory hierarchy is one of the major bottlenecks in achieving good program performance. This has motivated the search for ways of capturing the memory performance of an application/machine pair that are practical in terms of time and space, yet detailed enough to yield useful and relevant information. The strategy that we endorse periodically samples events during program execution, producing an event trace that is both manageable and informative. As demonstrated, this strategy allows a diverse set of performance issues to be studied using the same set of traces. For example, using one set of traces and our performance evaluation framework, memory access performance, process migration, compulsory and conflict misses, and false sharing can all be characterized.
Ricardo Portillo, Diana Villa, Patricia J. Teller,