Previous simulators for shared-memory architectures have imposed a large tradeoff between simulation accuracy and speed. Most such simulators model simple processors that do not e...
In this paper we present an exhaustive evaluation of the memory subsystem in a chip-multiprocessor (CMP) architecture composed of 16 cores. The characterization is performed making...
Current microprocessors aggressively exploit instructionlevel parallelism (ILP) through techniques such as multiple issue, dynamic scheduling, and non-blocking reads. Recent work ...
Parthasarathy Ranganathan, Vijay S. Pai, Hazim Abd...
Distributed shared memory systems have become popular as a means of utilizing clusters of computers for solving large applications. We have developed a high-performance DSM at Way...