Existing concurrency platforms for dynamic multithreading do not provide repeatable parallel random-number generators. This paper proposes that a mechanism called pedigrees be bui...
We propose the first polynomial-time code selection algorithm for minimising the worst-case execution time of a nonnested loop executed on a fully pipelined processor that uses sc...
Abstract. Linear scan register allocation is an attractive register allocation algorithm because of its simplicity and fast running time. However, it is generally felt that linear ...
Instruction reuse is a microarchitectural technique that improves the execution time of a program by removing redundant computations at run-time. Although this is the job of an op...
As on-chip networks become prevalent in multiprocessor systemson-a-chip and multi-core processors, they will be an integral part of the design flow of such systems. With power in...