We revisit memory hierarchy design viewing memory as an inter-operation communication agent. This perspective leads to the development of novel methods of performing inter-operati...
Cost and power consumption are two of the most important design factors for many embedded systems, particularly consumer devices. Products such as Personal Digital Assistants, pag...
Darko Kirovski, Johnson Kin, William H. Mangione-S...
Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. These caches are typically implemented with static RAM cells and often occu...
Johnson Kin, Munish Gupta, William H. Mangione-Smi...
In this paper we investigate the behavior of data prefetching on an access decoupled machine and a superscalar machine. We assess if there are bene ts to using the decoupling para...
As the disparity between processor and main memory performance grows, the number of execution cycles spent waiting for memory accesses to complete also increases. As a result, lat...
Teresa L. Johnson, Matthew C. Merten, Wen-mei W. H...
The multicluster architecture that we introduce offers a decentralized, dynamically-scheduled architecture, in which the register files, dispatch queue, and functional units of t...
Keith I. Farkas, Paul Chow, Norman P. Jouppi, Zvon...
Modern architectural trends in instruction-level parallelism (ILP) are to increase the computational power of microprocessors significantly. As a result, the demands on memory ha...
Predicated execution is a promising architectural feature for exploiting instruction-level parallelism in the presence of control flow. Compiling for predicated execution involve...