Effective use of communication networks is critical to the performance and scalability of parallel applications. Partitioned Global Address Space languages like UPC bring the pro...
Modern embedded systems often require high degrees of instruction-level parallelism (ILP) within strict constraints on power consumption and chip cost. Unfortunately, a high-perfo...
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Runtime bloat degrades significantly the performance and scalability of software systems. An important source of bloat is the inefficient use of containers. It is expensive to cre...
Collective operations are an important aspect of the currently most important message-passing programming model MPI (Message Passing Interface). Many MPI applications make heavy u...