In SMP clusters it is not always convenient to switch from pure message-passing code to hybrid software designs that exploit shared memory. This paper tackles the problem of restru...
Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming to distributed-memory platforms by automatic translation of OpenMP programs to ...
Abstract. Dynamic program optimization is the only recourse for optimizing compilers when machine and program parameters necessary for applying an optimization technique are unknow...
Synchronization is often necessary in parallel computing, but it can create delays whenever the receiving processor is idle, waiting for the information to arrive. This is especia...