Large-scale FFT on GPU clusters



A GPU cluster is a cluster equipped with GPU devices. Excellent acceleration is achievable for computation-intensive tasks (e.g. matrix multiplication and LINPACK) and for bandwidth-intensive tasks with data locality (e.g. finite-difference simulation). Bandwidth-intensive tasks without data locality, such as large-scale FFTs, are harder to accelerate, because the bottleneck often lies with the PCI bus between main memory and GPU device memory or with the communication network between workstation nodes. As a result, optimizing FFT performance on a single GPU device alone will not improve overall performance. This paper uses large-scale FFT as an example to show how to achieve substantial speedups for these more challenging tasks on a GPU cluster. Three GPU-related factors lead to better performance: firstly, the use of GPU devices improves the sustained memory bandwidth for processing large-size data; secondly, GPU device memory allows larger subtasks to be processed in whole and hence reduces repea...
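The abstract speaks of processing "larger subtasks" in whole but, as excerpted here, does not say how the transform is split. As an illustration only, the sketch below shows one standard way to break a large 1D FFT into smaller independent transforms, the four-step (Cooley-Tukey) decomposition for a length n = n1 * n2. It is not taken from the paper; the function names `dft` and `four_step_fft` are hypothetical, and a naive pure-Python DFT stands in for whatever tuned kernel would process each subtask on a real system.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT; a stand-in for the transform applied to one subtask."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """Length n1*n2 DFT built from n2 DFTs of length n1 and n1 DFTs of length n2.

    Input index is split as j = j1*n2 + j2, output index as k = k1 + n1*k2.
    """
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")

    # Step 1: n2 independent length-n1 transforms over the strided "columns".
    cols = [dft([x[j1 * n2 + j2] for j1 in range(n1)]) for j2 in range(n2)]

    # Step 2: multiply by the twiddle factors exp(-2*pi*i * j2*k1 / n).
    for j2 in range(n2):
        for k1 in range(n1):
            cols[j2][k1] *= cmath.exp(-2j * cmath.pi * j2 * k1 / n)

    # Steps 3 and 4: transpose, then n1 independent length-n2 transforms.
    out = [0j] * n
    for k1 in range(n1):
        row = dft([cols[j2][k1] for j2 in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out


if __name__ == "__main__":
    data = [complex(i % 5, (i * 7) % 3) for i in range(12)]
    direct = dft(data)
    split = four_step_fft(data, 3, 4)
    print(max(abs(a - b) for a, b in zip(direct, split)))
```

Each small transform in steps 1 and 3 touches only its own slice of the data, so slices can be handled independently. The transpose between the two passes is the stage without data locality, which is consistent with the abstract's point that data movement, not per-device arithmetic, tends to be the bottleneck.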

Yifeng Chen, Xiang Cui, Hong Mei


GPU Cluster | GPU Device | GPU Device Memory | ICS 2010


Post Info

Added	19 Jul 2010
Updated	19 Jul 2010
Type	Conference
Year	2010
Where	ICS
Authors	Yifeng Chen, Xiang Cui, Hong Mei
