The foremost goal of superscalar processor design is to increase performance through the exploitation of instruction-level parallelism (ILP). Previous studies have shown that spec...
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
Abstract. This paper explores using information about program branch probabilities to optimise reconfigurable designs. The basic premise is to promote utilization by dedicating mo...
When integrating software threads together to boost performance on a processor with instruction-level parallel processing support, it is rarely clear which code regions should be ...
Multi-core processors, with low communication costs and high availability of execution cores, will increase the use of execution and compilation models that use short threads to e...