We propose a method for compressing programs in embedded processors where instruction memory size dominates cost. A post-compilation analyzer examines a program and replaces commo...
Charles Lefurgy, Peter L. Bird, I-Cheng K. Chen, T...
Superscalar processors currently have the potential to fetch multiple basic blocks per cycle by employing one of several recently proposed instruction fetch mechanisms. However, t...
For maximum performance, an out-of-order processor must issue load instructions as early as possible, while avoiding memory-order violations with prior store instructions that wri...
Conventional speculative architectures use branch prediction to evaluate the most likely execution path during program execution. However, certain branches are difficult to predic...
Artur Klauser, Todd M. Austin, Dirk Grunwald, Brad...
With the increased clock frequency of modern, high-performance processors over 500 MHz, in some cases, limiting the power dissipation has become the most stringent design target. ...
Luca Benini, Giovanni De Micheli, Alberto Macii, E...
This paper describes a mechanism for automatic design and synthesis of very long instruction word (VLIW), and its generalization, explicitly parallel instruction computing rocesso...
Processors that can simultaneously execute multiple paths of execution will only exacerbate the fetch bandwidth problem already plaguing conventional processors. On a multiple-pat...
Abstract. Proof-carrying code and other applications in computer security require machine-checkable proofs of properties of machine-language programs. These in turn require axioms ...
In this paper, we evaluate a mechanism to reissue instructions on the mispredicted speculation path. An instruction which is once dispatched to a functional unit during mispredict...
Permutation is widely used in cryptographic algorithms. However, it is not well-supported in existing instruction sets. In this paper, two instructions, PPERM3R and GRP, are propo...