In this paper, we describe an algorithm and implementation of locality optimizations for architectures with instruction sets such as Intel’s SSE and Motorola’s AltiVec that su...
Abstract. We present a hardware mechanism which dynamically detects uniform and affine vectors used in Graphics Processing Units, to minimize pressure on the register file and redu...
In recent years, processor manufacturers have converged on two types of register file architectures. Both IBM with its POWER series and Intel with its Pentium series are using a ...
The compiler is generally regarded as the most important software component that supports a processor design to achieve success. This paper describes our application of the open re...
Current-generation microprocessors are designed to process instructions with one and two source operands at equal cost. Handling two source operands requires multiple ports for ea...