In this paper we propose and evaluate the Adaptive++ technique, a novel runtime-only data prefetching strategy for software-based distributed shared-memory systems (software DSMs)...
Ricardo Bianchini, Raquel Pinto, Claudio Luis de A...
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Dynamic optimization has the potential to adapt the program’s behavior at run-time to deliver performance improvements over static optimization. Dynamic optimization systems usu...
The effectiveness of the memory hierarchy is critical for the performance of current processors. The performance of the memory hierarchy can be improved by means of program transf...
This paper presents the Cameron Project 1 , which aims to provide a high level, algorithmic language and optimizing compiler for the development of image processing applications o...