Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Synchronization primitives for large shared-memory multiprocessors need to minimize latency and contention. Software queue-based locks address these goals, but suffer if a process...
Robert W. Wisniewski, Leonidas I. Kontothanassis, ...
Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several proce...
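For readers unfamiliar with the primitive, stream compaction is often formulated as a flag pass, an exclusive prefix sum, and a scatter. The sketch below is a sequential Python illustration of that common formulation, not the method of the paper summarized above, whose abstract is truncated here. The names `compact` and `keep` are hypothetical.

```python
from itertools import accumulate


def compact(values, keep):
    """Return the elements of `values` for which `keep` is true, in order.

    Sequential sketch of the scan-and-scatter formulation; on a parallel
    machine each of the three passes can run across all elements at once.
    """
    # Pass 1: flag the elements to keep (1) or drop (0).
    flags = [1 if keep(v) else 0 for v in values]

    # Pass 2: exclusive prefix sum, so offsets[i] counts kept elements before i.
    inclusive = list(accumulate(flags))
    offsets = [0] + inclusive[:-1] if inclusive else []

    # Pass 3: scatter each kept element to its output slot.
    out = [None] * (inclusive[-1] if inclusive else 0)
    for i, v in enumerate(values):
        if flags[i]:
            out[offsets[i]] = v
    return out
```

For example, `compact([3, 0, 0, 7, 0, 5], lambda v: v != 0)` keeps only the nonzero entries and preserves their original order.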
CODE 2.0 is a graphical parallel programming system that targets the three goals of ease of use, portability, and production of efficient parallel code. Ease of use is provided by...
We explore an algebraic language for networks consisting of a fixed number of reactive units, communicating synchronously over a fixed linking structure. The language has only two ope...