This paper presents CHeckpointed Early Resource RecYcling (Cherry), a hybrid mode of execution based on ROB and checkpointing that decouples resource recycling and instruction ret...
Register integration (or just integration) is a register renaming discipline that implements instruction reuse via physical register sharing. Initially developed to perform squash...
The key issues for register file design in high-performance processors are access time and energy. While previous work has focused on reducing the number of registers, we propose...
In this paper we present an approach to boosting performance and tolerating latency by deferring non-critical instructions into a deferred queue for later processing. As such, ins...
Gregory A. Muthler, David Crowe, Sanjay J. Patel, ...
Instruction cache aware compilation seeks to lay out a program in memory in such a way that cache conflicts between procedures are minimized. It does this through profile-driven...
Multimedia processing on embedded devices requires an architecture that leads to high performance, low power consumption, reduced design complexity, and small code size. In this p...
Simultaneous multithreading seeks to improve the aggregate computation bandwidth of a processor core by sharing resources such as functional units, caches, TLB and so on. To date,...
Clustering is a common technique to overcome the wire delay problem incurred by the evolution of technology. Fully-distributed architectures, where the register file, the functio...