Intra-Clustering: Accelerating On-chip Communication for Data Parallel Architectures

Abstract: Modern computation workloads contain abundant Data Level Parallelism (DLP), which requires specialized data parallel architectures such as Graphics Processing Units (GPUs). With parallel programming models such as CUDA and OpenCL, GPUs can easily be programmed for non-graphics applications, and have therefore become a cost-effective approach for data parallel architectures. The large quantity of available parallelism places heavy stress on the memory system, because the limited number of pins confines the number of memory controllers on the chip. This creates a potential bottleneck for the performance scalability of GPUs. To accelerate communication with the memory system, we propose the Intra-Clustering on-chip network for data parallel architectures, which is built upon a traditional two-dimensional electrical mesh network with memory controllers connected through a nanophotonic ring and compute cores grouped into different clusters. Our evaluations with CUDA benchmarks show tha...

Wen Yuan, Rahul Boyapati, Lei Wang, Hyunjun Jang,

Hardware | SBACPAD 2015 |

Post Info

Added	17 Apr 2016
Updated	17 Apr 2016
Type	Journal
Year	2015
Where	SBACPAD
Authors	Wen Yuan, Rahul Boyapati, Lei Wang, Hyunjun Jang, Yuho Jin, Ki Hwan Yum, Eun Jung Kim
