Performance Study of LU Decomposition on the Programmable GPU

With the increasing programmability of GPUs (graphics processing units), these units are emerging as an attractive computing platform not only for traditional graphics computation but also for general-purpose computation. In this paper, to study the performance of programmable GPUs, we describe the design and implementation of LU decomposition as an example of numerical computation. To this end, we have developed and evaluated several methods that take different implementation approaches to (a) loop processing, (b) branch processing, and (c) vector processing. The experimental results highlight four important points: (1) dependent loops must be implemented through the use of a render texture in order to avoid copies in the video random access memory (VRAM); (2) in most cases, branch processing can be handled more efficiently by the CPU than by the GPU; (3) as Fatahalian et al. state for matrix multiplication, we find that GPUs require higher VRAM cache bandwidth in order to provide fu...
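For readers unfamiliar with the loop structure the abstract refers to, the following is a minimal CPU-side sketch of LU decomposition in plain Python. It is not the authors' GPU implementation: the function name `lu_decompose`, the Doolittle form, and the absence of pivoting are all assumptions made for illustration. It only shows why the outer loop is dependent (step `k` reads the matrix as updated by step `k - 1`), while the row updates inside one step are independent of each other.

```python
def lu_decompose(a):
    """Doolittle LU decomposition without pivoting (illustrative sketch).

    Takes a square matrix as a list of lists and returns (L, U) such that
    L is unit lower triangular, U is upper triangular, and L * U == a.
    """
    n = len(a)
    u = [[float(x) for x in row] for row in a]  # working copy, becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Dependent loop: iteration k uses the values written by iteration k - 1.
    for k in range(n):
        pivot = u[k][k]
        if pivot == 0.0:
            # Assumption: no pivoting in this sketch, so a zero pivot is fatal.
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")

        # Independent updates: each row i below the pivot row can be
        # processed without reference to the other rows of this step.
        for i in range(k + 1, n):
            factor = u[i][k] / pivot
            l[i][k] = factor
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]

    return l, u
```

Calling `lu_decompose([[4.0, 3.0], [6.0, 3.0]])` returns the two triangular factors, and multiplying them back together reproduces the input matrix.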

Fumihiko Ino, Manabu Matsui, Keigo Goda, Kenichi Hagihara

Branch Processing | GPUs | HIPC 2005 | LU Decomposition

» Performance Comparison of Graphics Processors to Reconfigurable Logic: A Case Study

» A performance study of general-purpose applications on graphics processors using CUDA

» Accelerating S3D: A GPGPU Case Study

» Implementation of 802.11n on 128-CORE Processor

» Solving path problems on the GPU

» A performance study of multiprocessor task scheduling algorithms

» Belief Propagation by Message Passing in Junction Trees: Computing Each Message Faster Usin...

» A Case Study of Selected SPLASH-2 Applications and the SBT Debugging Tool

Post Info

Added	27 Jun 2010
Updated	27 Jun 2010
Type	Conference
Year	2005
Where	HIPC
Authors	Fumihiko Ino, Manabu Matsui, Keigo Goda, Kenichi Hagihara
