Recent studies on program execution behavior reveal that a large amount of execution time is spent in small frequently executed regions of code. Whereas adaptive cache management ...
Ravikrishnan Sree, Alex Settle, Ian Bratt, Daniel ...
Leakage power in cache memories represents a sizable fraction of total power consumption, and many techniques have been proposed to reduce it. As a matter of fact, during a fixed ...
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. ...
Instruction delivery is a critical component for wide-issue processors since its bandwidth and accuracy place an upper limit on performance. The processor front-end accuracy and ba...