In this paper, we have presented the design and evaluation of a compiler system, called APE, for automatic parallelization of scientific and engineering applications on distributed...
Large, highly distributed data sets are poorly supported by current query technologies. Applications such as endsystem-based network management are characterized by data stored on l...
Dushyanth Narayanan, Austin Donnelly, Richard Mort...
A blossoming paradigm for block-recursive matrix algorithms is presented that, at once, attains excellent performance measured by time, TLB misses, L1 misses, L2 m...
Gather and scatter are data redistribution functions of longstanding importance to high performance computing. In this paper, we present a highly general array operator with power...
Steven J. Deitz, Bradford L. Chamberlain, Sung-Eun...
Partitioned parallel radix sort is a parallel radix sort that shortens the execution time by modifying the load-balanced radix sort, which is known as one of the fastest internal sort...
Shin-Jae Lee, Minsoo Jeon, Andrew Sohn, Dongseung ...