Global addressing of shared data simplifies parallel programming and complements message passing models commonly found in distributed memory machines. A number of programming sys...
Beng-Hong Lim, Chi-Chao Chang, Grzegorz Czajkowski...
To ensure low power consumption while maintaining flexibility and performance, future Systems-on-Chip (SoC) will combine several types of processor cores and data memory units of...
Anthony Leroy, Paul Marchal, Adelina Shickova, Fra...
A multi-cabinet implementation of a combined input and crosspoint queued (CICQ) switch introduces a large RTT latency between the line cards and switch fabric, requiring a large cr...
The frequency of accesses to remote data is a key factor affecting the performance of all Distributed Shared Memory (DSM) systems. Remote data caching is one of the most effective...
Streamlining communication is key to achieving good performance in shared-memory parallel programs. While full hardware support for cache coherence generally offers the best perfo...