Superscalar processors currently have the potential to fetch multiple basic blocks per cycle by employing one of several recently proposed instruction fetch mechanisms. However, t...
DataScalar architectures improve memory system performance by running computation redundantly across multiple processors, which are each tightly coupled with an associated memory....
This work provides a systematic study of the impact of communication performance on parallelapplications in a high performance network of workstations. We develop an experimental ...
Richard P. Martin, Amin Vahdat, David E. Culler, T...
The performance tradeoff between hardware complexity and clock speed is studied. First, a generic superscalar pipeline is defined. Then the specific areas of register renaming, ...
Subbarao Palacharla, Norman P. Jouppi, James E. Sm...
Portable systems demand energy efficiency in order to maximize battery life. IRAM architectures, which combine DRAM and a processor on the same chip in a DRAM process, are more en...
Richard Fromm, Stylianos Perissakis, Neal Cardwell...