By optimizing data layout at run-time, we can potentially enhance the performance of caches by actively creating spatial locality, facilitating prefetching, and avoiding cache con...
There is a class of sparse matrix computations, such as direct solvers of systems of linear equations, that change the fill-in (nonzero entries) of the coefficient matrix, and invo...
Offload C++ is an extended version of the C++ language, together with a compiler and runtime system, for automatically offloading general-purpose C++ code to run on the Synergistic...
Alastair F. Donaldson, Uwe Dolinsky, Andrew Richar...
Abstract. Semantics-based program analysis uses an abstract semantics of programs/systems to statically determine run-time properties. Classic examples from compiler technology inc...
Alessandra Di Pierro, Chris Hankin, Herbert Wiklic...
Automatic parallelization is a promising strategy to improve application performance in the multicore era. However, common programming practices such as the reuse of data structur...
Nick P. Johnson, Hanjun Kim, Prakash Prabhu, Ayal ...