Operand gating is a technique for improving processor energy efficiency by gating off sections of the data path that are unneeded by short-precision (narrow) operands. A method fo...
In this paper, we describe a generalized approach to deriving a custom data layout in multiple memory banks for array-based computations, to facilitate high-bandwidth parallel mem...
Predicated execution enables the removal of branches by converting segments of branching code into sequences of conditional operations. An important side effect of this transforma...
Mikhail Smelyanskiy, Scott A. Mahlke, Edward S. Da...
Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or from the innermost loop to outer loops. In this paper, we propose a three-step ap...
Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or from the innermost loop to the outer loops. In a companion paper, we proposed a ...
Hongbo Rong, Alban Douillet, Ramaswamy Govindaraja...
Static Single Assignment form is an intermediate representation that uses φ instructions to merge values at each confluent point of the control flow graph. φ instructions are not ma...
The effective use of processor caches is crucial to the performance of applications. It has been shown that cache misses are not evenly distributed throughout a program. In applic...
We study several major characteristics of dynamic optimization within the PARROT power-aware, trace-cache-based microarchitectural framework. We investigate the benefit of providin...
Yoav Almog, Roni Rosner, Naftali Schwartz, Ari Sch...