Nested Data-Parallelism on the GPU

Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple-Data (SIMD) architecture, they are hard to program. Most of the programs ported to GPUs thus far use traditional data-level parallelism, performing only operations that operate uniformly over vectors. NESL is a first-order functional language that was designed to allow programmers to write irregular-parallel programs, such as parallel divide-and-conquer algorithms, for wide-vector parallel computers. This paper presents our port of the NESL implementation to work on GPUs and provides empirical evidence that nested data-parallelism (NDP) on GPUs significantly outperforms CPU-based implementations and matches or beats newer GPU languages that support only flat parallelism. While our performance does not match that of hand-tuned CUDA programs, we argue that the notational conciseness of NESL is worth th...
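To make the contrast between nested and flat data-parallelism concrete, here is a minimal Python sketch of the general flattening idea associated with NESL: a nested sequence is stored as one flat vector of values plus a vector of segment lengths, so that a per-segment operation can be expressed with whole-vector steps. This is an illustration only, not the authors' implementation; the helper names `flatten` and `segmented_sum` are hypothetical and do not come from the paper.

```python
from itertools import accumulate


def flatten(nested):
    """Turn a list of lists into (flat values, segment lengths)."""
    values = [x for seg in nested for x in seg]
    lengths = [len(seg) for seg in nested]
    return values, lengths


def segmented_sum(values, lengths):
    """Sum each segment using only whole-vector steps on the flat data."""
    # Prefix sums over all values, with a leading 0 so prefix[i] is the
    # sum of the first i values.
    prefix = [0] + list(accumulate(values))
    # End and start offset of each segment within the flat vector.
    ends = list(accumulate(lengths))
    starts = [e - n for e, n in zip(ends, lengths)]
    # Each segment's sum is the difference of two prefix entries.
    return [prefix[e] - prefix[s] for s, e in zip(starts, ends)]


# Irregular input: segments of different lengths, including an empty one.
nested = [[1, 2, 3], [], [4, 5], [6]]
values, lengths = flatten(nested)
sums = segmented_sum(values, lengths)
print(sums)
```

Each step here (prefix sum, elementwise subtraction, gather) is a uniform operation over a flat vector, which is the kind of operation that maps well onto wide-vector hardware even though the input segments are irregular.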

Lars Bergstrom, John H. Reppy

Data Parallelism | ICFP 2012 | Irregular Parallel Programs | Programming Languages | Programming Languages Processors

Post Info

Added	29 Sep 2012
Updated	29 Sep 2012
Type	Journal
Year	2012
Where	ICFP
Authors	Lars Bergstrom, John H. Reppy
