Register files are in the critical path of most high-performance processors and their latency is one of the most important factors that limit their size. Our goal is to develop er...
Gokhan Memik, Masud H. Chowdhury, Arindam Mallik, ...
1 Table lookups are one of the most frequently-used operations in symmetric-key ciphers. Particularly in the newer algorithms such as the Advanced Encryption Standard (AES), we fr...
Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, in order to exploit the multistreaming processor (MSP) hard...
Leakage power is a major concern in current and future microprocessor designs. In this paper, we explore the potential of architectural techniques to reduce leakage through power-...
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...