This paper introduces dynamic object colocation, an optimization to reduce copying costs in generational and other incremental garbage collectors by allocating connected objects t...
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
The need to conduct and manage large sets of experiments for scientific applications dramatically increased over the last decade. However, there is still very little tool support ...
— Cell Superscalar (CellSs) provides a simple, flexible and easy programming approach for the Cell Broadband Engine (Cell/B.E.) that automatically exploits the inherent concurre...
Dataflow analyses can have mutually beneficial interactions. Previous efforts to exploit these interactions have either (1) iteratively performed each individual analysis until no...