On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
For most parallel and high-performance systems, tuning guides give users advice on optimizing the execution time of their programs. Execution time may be very sensitive...
Many code analysis techniques for optimization, debugging, or parallelization need to perform runtime disambiguation of sets of addresses. Such operations can be supported efficie...
James Tuck, Wonsun Ahn, Luis Ceze, Josep Torrellas
We define the base polytope B(P, g) of a partially ordered set P and a supermodular function g on the ideals of P as the convex hull of the incidence vectors of all linear extensio...
The Cell BE processor provides both scalable computation power and flexibility, and it is already being adopted for many computationally intensive applications like aerospace, defens...