Since the onset of pipelined processors, balancing the delay of the microarchitectural pipeline stages such that each microarchitectural pipeline stage has an equal delay has been...
A growing body of work has compiled a strong case for the single-ISA heterogeneous multi-core paradigm. A single-ISA heterogeneous multi-core provides multiple, differently-design...
Niket Kumar Choudhary, Salil V. Wadhavkar, Tanmay ...
Neural-inspired branch predictors achieve very low branch misprediction rates. However, previously proposed implementations have a variety of characteristics that make them challen...
Although CMOS feature size scaling has been the source of dramatic performance gains, it has lead to mounting reliability concerns due to increasing power densities and on-chip te...
Shantanu Gupta, Shuguang Feng, Amin Ansari, Jason ...
Profile data is valuable for identifying performance bottlenecks and guiding optimizations. Periodic sampling of a processor's performance monitoring hardware is an effective...
Jeffrey Dean, James E. Hicks, Carl A. Waldspurger,...
Data, addresses, and instructions are compressed by maintaining only significant bytes with two or three extension bits appended to indicate the significant byte positions. This s...
Neural-inspired branch predictors achieve very low branch misprediction rates. However, previously proposed implementations have a variety of characteristics that make them challe...
Abstract—This paper presents a method to investigate powerperformance tradeoffs in digital pipelined designs. The method is applied at the architectural level of the design. It w...
Internet Protocol (IP) lookup in routers can be implemented by some form of tree traversal. Pipelining can dramatically improve the search throughput. However, it results in unbal...
In this paper, we describe a set of compiler analyses and an implementation that automatically map a sequential and un-annotated C program into a pipelined implementation, targete...