On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
Abstract. The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the u...
Shuo Yang, Ali Raza Butt, Y. Charlie Hu, Samuel P....
This paper describes the design and evaluation of an auto-memoization processor. The major point of this proposal is to detect the multilevel functions and loops with no additiona...
Convergent scheduling is a general framework for instruction scheduling and cluster assignment for parallel, clustered architectures. A convergent scheduler is composed of many ind...
Walter Lee, Diego Puppin, Shane Swenson, Saman P. ...
Memory-intensive threads can hoard shared resources without making progress on a multithreading processor (SMT), thereby hindering the overall system performance. A recent promisi...