Collecting a program’s execution profile is important for many reasons: code optimization, memory layout, program debugging and program comprehension. Path based execution pro...
Abstract. The Parallel Disks Model (PDM) has been proposed to alleviate the I/O bottleneck that arises in the processing of massive data sets. Sorting has been extensively studied ...
: Data distribution is one of the key aspects that a parallelizing compiler for a distributed memory architecture should consider, in order to get efficiency from the system. The ...
The MPI Standard supports derived datatypes, which allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. Thi...
Surendra Byna, William D. Gropp, Xian-He Sun, Raje...
FPGAs have appealing features such as customizable internal and external bandwidth and the ability to exploit vast amounts of fine-grain parallelism. In this paper we explore the ...