Convergent scheduling is a general framework for instruction scheduling and cluster assignment for parallel, clustered architectures. A convergent scheduler is composed of many ind...
Walter Lee, Diego Puppin, Shane Swenson, Saman P. ...
On-chip caches represent a sizeable fraction of the total power consumption of microprocessors. Although large caches can significantly improve performance, they have the potentia...
Power consumption and power density for the Translation Lookaside Buffer (TLB) are important considerations not only in its design, but can have a consequence on cache design as w...
Ismail Kadayif, Anand Sivasubramaniam, Mahmut T. K...
Research has shown that precomputation microthreads can be useful for improving branch prediction and prefetching. However, it is not obvious how to provide the necessary microarc...
Robert S. Chappell, Francis Tseng, Adi Yoaz, Yale ...
This paper introduces the Simulation Architecture Description Language (SADL) developed at the National Aeronautics and Space Administration's Marshall Space Flight Center to...
In this paper, we consider the tree task graphs which arise from many important programming paradigms such as divide and conquer, branch and bound etc., and the linear task-graphs...
Traditional vector architectures have shown to be very effective for regular codes where the compiler can detect data-level parallelism. However, this SIMD parallelism is also pre...
ion layer3,4 hides hardware particulars from the higher levels of software but can also compromise performance and compatibility; the higher levels of software often make unwitting...
Previous studies measuring the performance of general-purpose operating systems running large-scale Internet server applications, such as proxy caches, have identified design defi...