Whenever large homogeneous data structures need to be processed in a non-trivial way, e.g. in computational sciences, image processing, or system simulation, high-level array prog...
In this work, we present several compiler optimizations to reduce the overhead due to software protection. We first propose an aggressive rematerialization algorithm which attempt...
Fetching instructions from a set-associative cache in an embedded processor can consume a large amount of energy due to the tag checks performed. Recent proposals to address this ...
Timothy M. Jones, Sandro Bartolini, Bruno De Bus, ...
Helper threading is a technique that utilizes a second core or logical processor in a multi-threaded system to improve the performance of the main thread. A helper thread executes...
Array remappings are useful to many applications on distributed memory parallel machines. They are available in High Performance Fortran, a Fortran-based data-parallel language. T...