A scalable communication-aware compilation flow for programmable accelerators

9 years 10 months ago

Abstract: Programmable accelerators (PAs) are receiving increased attention in domain-specific architecture designs to provide more general support for customization. In a PA-rich system, computational kernels are compiled into predefined PA templates and dynamically mapped to real PAs at runtime. This poses a demanding challenge for the compiler: how to generate high-quality PA mapping code. Another important concern is the communication cost among PAs: if not handled properly at compile time, data transfers among tens or hundreds of accelerators in a PA-rich system will limit the overall performance gain. In this paper we present an efficient PA compilation flow that scales to mapping large computation kernels onto PA-rich architectures. Communication overhead is modeled and optimized in the proposed flow to reduce runtime data transfers among accelerators. Experimental results show that for 12 computation-intensive standard benchmarks, the proposed appr...

Jason Cong, Hui Huang 0001, Mohammad Ali Ghodrat