3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...
Hard-to-predict branches depending on longlatency cache-misses have been recognized as a major performance obstacle for modern microprocessors. With the widening speed gap between...
Hongliang Gao, Yi Ma, Martin Dimitrov, Huiyang Zho...
This paper describes a scalable, low-complexity alternative to the conventional load/store queue (LSQ) for superscalar processors that execute load and store instructions speculat...
Every algebraic theory gives rise to a monad, and monads allow a meta-language which is a basic programming language with sideeffects. Equations in the algebraic theory give rise ...
As processors continue to exploit more instruction level parallelism, a greater demand is placed on reducing the e ects of memory access latency. In this paper, we introduce a nov...