Abstract— High-level synthesis (HLS) of memory-intensive applications has featured several innovations in terms of enhancements made to the basic memory organization and data lay...
A GPU cluster is a cluster equipped with GPU devices. Excellent acceleration is achievable for computation-intensive tasks (e.g. matrix multiplication and LINPACK) and bandwidth-i...
One of the fundamental problems in distributed computing is how to efficiently perform routing in a faulty network in which each link fails with some probability. This paper inves...
The problem attacked in this paper is one of automatically mapping an application onto a Network-on-Chip (NoC) based chip multiprocessor (CMP) architecture in a locality-aware fas...
Guangyu Chen, Feihui Li, Seung Woo Son, Mahmut T. ...
3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...