As we enter the era of petascale computing, system architects must plan for machines composed of tens or even hundreds of thousands of processors. Although fully connected network...
Shoaib Kamil, Ali Pinar, Daniel Gunter, Michael Li...
Performance tuning is an important and time consuming task which may have to be repeated for each new application and platform. Although iterative optimisation can automate this p...
Energy-efficient microprocessor designs are one of the major concerns in both high performance and embedded processor domains. Furthermore, as process technology advances toward d...
In this work we reduce interconnect power dissipation in Symmetric Multiprocessors or SMPs. We revisit snoopy cache coherence protocols and reduce unnecessary interconnect activit...
In this paper we describe a new technique, called pipeline spectroscopy, and use it to measure the cost of each cache miss. The cost of a miss is displayed (graphed) as a histogra...
Thomas R. Puzak, Allan Hartstein, Philip G. Emma, ...