—Efficient sharing of system resources is critical to obtaining high utilization and enforcing system-level performance objectives on chip multiprocessors (CMPs). Although sever...
Shared memory multiprocessor systems typically provide a set of hardware primitives in order to support synchronization. Generally, they provide single-word read-modify-write hard...
This paper describes the synchronization and communication primitives of the Cray T3E multiprocessor, a shared memory system scalable to 2048 processors. We discuss what we have l...
The emphasis in microprocessor design has shifted from high performance, to a combination of high performance and low power. Until recently, this trend was mostly true for uniproc...
Practical cache replacement policies attempt to emulate optimal replacement by predicting the re-reference interval of a cache block. The commonly used LRU replacement policy alwa...
Aamer Jaleel, Kevin B. Theobald, Simon C. Steely J...