Abstract— Recent research has shown that forwarding speculative data to other processors before it is requested can improve the performance of multiprocessor systems. The most re...
We demonstrate that data reordering can substantially improve the performance of fine-grained irregular sharedmemory benchmarks, on both hardware and software shared-memory syste...
Abstract. This paper gives an overview of locality enhancement techniques used by the Jasmine compiler, currently under development at the University of Toronto. These techniques e...
Tarek S. Abdelrahman, Naraig Manjikian, Gary Liu, ...
We present a method to visit all nodes in a forest of data structures while taking into account object placement. We call the technique a Localized Tracing Scheme as it improves lo...
The memory hierarchy of most multicore systems contains one or more levels of cache that is shared among multiple cores. The shared-cache architecture presents many opportunities f...