In this paper we present an approach to boosting performance and tolerating latency by deferring non-critical instructions into a deferred queue for later processing. As such, ins...
Gregory A. Muthler, David Crowe, Sanjay J. Patel, ...
We propose a scheme for transient-fault recovery called Simultaneously and Redundantly Threaded processors with Recovery (SRTR) that enhances a previously proposed scheme for tran...
An increasingly large portion of scheduler latency is derived from the monolithic content addressable memory (CAM) arrays accessed during instruction wakeup. The performance of th...
Modern microprocessors schedule instructions dynamically in order to exploit instruction-level parallelism. It is necessary to increase instruction window size for improving instr...
Instruction set simulators are critical tools for the exploration and validation of new programmable architectures. Due to increasing complexity of the architectures and timeto-ma...
On a N-way issue superscalar processor, the front end instruction fetch engine must deliver instructions to the execution core at a sustained rate higher than N instructions per c...
To achieve high instruction throughput, instruction schedulers must be capable of producing high-quality schedules that maximize functional unit utilization while at the same time...
: A modern special-purpose processor (e.g., for image and graphical applications) usually contains a set of instructions supporting complex multiply-operations. These instructions ...
Alex C.-Y. Chang, Wu-An Kuo, Allen C.-H. Wu, TingT...
Current processors exploit out-of-order execution and branch prediction to improve instruction level parallelism. When a branch prediction is wrong, processors flush the pipeline ...
VLIW architectures are popular in embedded systems because they offer high-performance processing at low cost and energy. The major problem with traditional VLIW designs is that t...
Hongtao Zhong, Kevin Fan, Scott A. Mahlke, Michael...