The power consumption of Dynamically Reconfigurable Processing Array (DRPA) is quantitatively analyzed by using a real chip layout and applications taking into account the reconfi...
Exploiting parallelism at both the multiprocessor level and the instruction level is an e ective means for supercomputers to achieve high-performance. The amount of instruction-le...
Scott A. Mahlke, William Y. Chen, John C. Gyllenha...
Software systems typically exploit only a small fraction of the realizable performance from the underlying microprocessors. While there has been much work on hardware-aware optimiz...
Dan Knights, Todd Mytkowicz, Peter F. Sweeney, Mic...
: Since the era of vector and pipelined computing, the computational speed is limited by the memory access time. Faster caches and more cache levels are used to bridge the growing ...
† In this paper, we propose a parallel randomized algorithm, called Parallel Fast Assignment using Search Technique (PFAST), for scheduling parallel programs represented by direc...