Many sorting algorithms have been studied in the past, but there are only a few algorithms that can effectively exploit both SIMD instructions and threadlevel parallelism. In this...
Large instruction window processors achieve high performance by exposing large amounts of instruction level parallelism. However, accessing large hardware structures typically req...
Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasa...
We propose a method for compressing programs in embedded processors where instruction memory size dominates cost. A post-compilation analyzer examines a program and replaces commo...
Charles Lefurgy, Peter L. Bird, I-Cheng K. Chen, T...
A design process is presented for the selection of a set of instruction set extensions for the PowerPC 405 processor that is embedded into the Xilinx Virtex Family of FPGAs. The i...
Brian F. Veale, John K. Antonio, Monte P. Tull, S....
With the increased clock frequency of modern, high-performance processors over 500 MHz, in some cases, limiting the power dissipation has become the most stringent design target. ...
Luca Benini, Giovanni De Micheli, Alberto Macii, E...