Current on-chip block-centric memory hierarchies exploit access patterns at the fine-grain scale of small blocks. Several recently proposed techniques for coherence traffic reduct...
In this paper we describe the design and implementation of a flexible, and extensible, just-in-time ARM simulator designed to run co-operatively with a multi-core DSP simulator on...
Reconfigurable instruction set processors have the capability to adapt their instruction sets to the application being executed through a reconfiguration in their hardware. Throug...
Dynamic data distributions offer a number of performance benefits, but require more sophisticated compiler support and incur run-time overhead. We investigate attainable benefits ...
Recent superscalar processors issue four instructions per cycle. These processors are also powered by highly-parallel superscalar cores. The potential performance can only be expl...
Thomas M. Conte, Kishore N. Menezes, Patrick M. Mi...