We describe the design, analysis, and performance of an on–line algorithm to dynamically control the frequency/voltage of a Multiple Clock Domain (MCD) microarchitecture. The MC...
Greg Semeraro, David H. Albonesi, Steve Dropsho, G...
Trace caches enable high bandwidth, low latency instruction supply, but have a high miss penalty and relatively large working sets. Consequently, their performance may suffer due ...
Modern CPUs have instructions that allow basic operations to be performed on several data elements in parallel. These instructions are called SIMD instructions, since they apply a...
— This paper describes the mapping of a recently introduced template matching algorithm based on the Normalized Cross Correlation (NCC) on a general purpose processor endowed wit...
Luigi di Stefano, Stefano Mattoccia, Federico Tomb...
Instruction combining is an optimization to replace a sequence of instructions with a more efficient instruction yielding the same result in a fewer machine cycles. When we use it...