In this paper, we describe an algorithm and implementation of locality optimizations for architectures with instruction sets such as Intel’s SSE and Motorola’s AltiVec that su...
In this paper, we present a novel technique to reduce dynamic and static power dissipation in the issue queue. The proposed scheme is based on delaying the dispatch of instructions...
Register files are in the critical path of most high-performance processors and their latency is one of the most important factors that limit their size. Our goal is to develop er...
Gokhan Memik, Masud H. Chowdhury, Arindam Mallik, ...
Many current programmable architectures designed to exploit data parallelism require computation to be structured to operate on sequentially accessed vectors or streams of data. A...
Nuwan Jayasena, Mattan Erez, Jung Ho Ahn, William ...
This paper describes the outcomes of the NSF Grant CNS-0615085: CSR-EHS: Enhancing the Effectiveness of Utilizing an Instruction Register File. We improved promoting instructions ...