Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Relays play an important role for increasing rate and reducing energy consumption of wireless networks. In this paper we consider a three-node network (source, relay, destination)...
Optimal network performance is critical to efficient parallel scaling for communication-bound applications on large machines. With wormhole routing, no-load latencies do not increa...
Abhinav Bhatele, Eric J. Bohm, Laxmikant V. Kal&ea...
Software components are modular and can enable post-deployment update, but their high overhead in runtime and memory is prohibitive for many embedded systems. This paper proposes ...
Software or hardware data cache prefetching is an efficient way to hide cache miss latency. However effectiveness of the issued prefetches have to be monitored in order to maximi...