Clustered microarchitectures are an effective approach to reducing the penalties caused by wire delays inside a chip. Current superscalar processors have in fact a two-cluster mic...
Ramon Canal, Joan-Manuel Parcerisa, Antonio Gonz&a...
One of the ways that custom instruction set extensions can improve over software execution is through the use of hardware structures that have been optimized at the arit...
We consider the problem of designing scheduling algorithms for the downlink of cellular wireless networks where bandwidth is partitioned into tens to hundreds of parallel channels...
Shreeshankar Bodas, Sanjay Shakkottai, Lei Ying, R...
Processor architectures with tens to hundreds of arithmetic units are emerging to handle media processing applications. These applications, such as image coding, image synthesis, ...
Scott Rixner, William J. Dally, Brucek Khailany, P...
We pose the question: how do we efficiently evaluate a join operator, distributed over a heterogeneous network? Our objective here is to optimize the delay of output tuples. We di...