The size and speed of first-level caches and SRAMs of embedded processors continue to increase in response to demands for higher performance. In power-sensitive devices like PDAs a...
In this paper we study the performance improvements and trade-offs derived from an optimized mapping approach applied on a parametric coarse grained reconfigurable array architect...
Grigoris Dimitroulakos, Michalis D. Galanis, Const...
We initiate the study of sublinear-time algorithms in the external memory model [Vit01]. In this model, the data is stored in blocks of a certain size B, and the algorithm is char...
Alexandr Andoni, Piotr Indyk, Krzysztof Onak, Roni...
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalizedcommunication (AAPC) over communication networks such as meshes, hypercubes and ...
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...