Most large shared-memory multiprocessors use directory protocols to keep per-processor caches coherent. Some memory references in such systems, however, suffer long latencies for ...
Microprocessors and memory systems su er from a growing gap in performance. We introduce Active Pages, a computation model which addresses this gap by shifting data-intensive comp...
This paper examines the performance of desktop applications running on the Microsoft Windows NT operating system on Intel x86 processors, and contrasts these applications to the p...
Dennis C. Lee, Patrick Crowley, Jean-Loup Baer, Th...
Branch prediction has enabled microprocessors to increase instruction level parallelism (ILP) by allowing programs to speculatively execute beyond control boundaries. Although spe...
Modern cache designs exploit spatial locality by fetching large blocks of data called cache lines on a cache miss. Subsequent references to words within the same cache line result...
In the past decade. there has been much literature describing various cache organizatrons that exploit general programming idiosyncrasies to obtain maxrmum hit rate (the probabili...