The XScale processor family provides user-controllable independent configuration of CPU, bus, and memory frequencies. This feature introduces another handle for the code optimizat...
We discuss the parallelization of algorithms for solving polynomial systems symbolically by way of triangular decompositions. We introduce a component-level parallelism for which ...
The run-time performance of VLIW (very long instruction word) microprocessors depends heavily on the effectiveness of its associated optimizing compiler. Typical VLIW compiler pha...
Set-associative caches are traditionally managed using hardwarebased lookup and replacement schemes that have high energy overheads. Ideally, the caching strategy should be tailor...
Rajiv A. Ravindran, Michael L. Chu, Scott A. Mahlk...
Abstract. We propose and discuss foundations for programmable overlay networks and overlay computing systems. Such overlays are built over a large number of distributed computation...