Network performance in tightly-coupled multiprocessors typically degrades rapidly beyond network saturation. Consequently, designers must keep a network below its saturation point...
Mithuna Thottethodi, Alvin R. Lebeck, Shubhendu S....
We propose methods for reducing the energy consumed by snoop requests in snoopy bus-based symmetric multiprocessor (SMP) systems. Observing that a large fraction of snoops do not ...
Andreas Moshovos, Gokhan Memik, Babak Falsafi, Alo...
The performance of out-of-order processors increases with the instruction window size. In conventional processors, the effective instruction window cannot be larger than the issue...
Mispredicted branches and loads that miss in the cache cause the majority of retirement stalls experienced by sequential processors; we call these critical instructions. Despite t...
In this papel; we address the severe performance gap caused by high processor clock rates and slow DRAM accesses. We show that even with an aggressive, next-generation memory syst...
This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a host processor and a simpler memory processor. To achieve high...
This paper introduces a router delay model that accurately models key aspects of modern routers. The model accounts for the pipelined nature of contemporary routers, the specific ...
Clustered ILP processors are characterized by a large number of non-centralized on-chip resources grouped into clusters. Traditional code generation schemes for these processors c...
Krishnan Kailas, Kemal Ebcioglu, Ashok K. Agrawala