This paper introduces Way Stealing, a simple architectural modification to a cache-based processor to increase data bandwidth to and from application-specific Instruction Set Exte...
Theo Kluter, Philip Brisk, Paolo Ienne, Edoardo Ch...
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
The register file is one of the most critical datapath components limiting the number of threads that can be supported on a Simultaneous Multithreading (SMT) processor. To allow t...
Several manufacturers have recently announced the first simultaneous-multithreaded processors, both as single CPUs and as components of multi-CPU chips. All are small scale, compr...
This paper presents the enhancement of an ASIP’s floating point performance by coupling of a co-processor and adding of special instructions. Processor hardware modifications an...