We present Huckleberry, a tool for automatically generating parallel implementations for multi-core platforms from sequential recursive divide-and-conquer programs. The recursiv...
Rebecca L. Collins, Bharadwaj Vellore, Luca P. Car...
We show how computations such as those involved in American or European-style option price valuations with the explicit finite difference method can be performed in par...
Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural resources, it has hitherto been limited to single kernel instantiations; in addi...
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, ...
Recently, the number of cores on general-purpose processors has been increasing rapidly. Using conventional programming models, it is challenging to effectively exploit these core...
Jayanth Gummaraju, Joel Coburn, Yoshio Turner, Men...
Loop tiling is an important compiler transformation used for enhancing data locality and exploiting coarse-grained parallelism. Tiled codes in which tile sizes are runtime...
Albert Hartono, Muthu Manikandan Baskaran, J. Rama...