Shared memory is an appealing abstraction for parallel programming. It must be implemented with caches in order toperform well, however, and caches require a coherence mechanism t...
We present a metamodel for runtime architecture and demonstrate with experimental results how this metamodel can be used to recover, analyze and improve runtime architecture of mo...
A key challenge in achieving high performance on software DSM systems is overcoming their relatively large communication latencies. In this paper, we consider two techniques which...
In hardware - software codesign paradigm often a performance estimation of the system is needed for hardware - software partitioning. The tremendous growth of application specific...
As the gap between memory and processor speeds continues to widen, cache efficiency is an increasingly important component of processor performance. Compiler techniques have been...