Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs

Download www.cs.virginia.edu

Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture, there are usually halo regions that need to be updated and exchanged among different processing elements (PEs). In addition, synchronization is often used to signal the completion of halo exchanges. Both communication and synchronization may incur significant overhead on parallel architectures with shared memory. This is especially true in the case of graphics processors (GPUs), which do not preserve the state of the per-core L1 storage across global synchronizations. To reduce these overheads, ghost zones can be created to replicate stencil operations, reducing communication and synchronization costs at the expense of redundantly computing some values on multiple PEs. However, the selection of the optimal ghost zone size depends on the characteristics of both the architecture and the application, and it ...
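
The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' performance model or their GPU implementation; it is a plain-Python toy that assumes a 1D three-point averaging stencil with fixed boundary cells. The names `step`, `reference`, and `tiled_with_ghost_zones`, and the parameters `tile` and `ghost`, are hypothetical. Each tile loads a halo of `ghost` extra cells per side, runs `ghost` sweeps without any exchange, and keeps only its own interior, so halo exchanges drop by roughly a factor of `ghost` while the number of cells touched grows.

```python
def step(u):
    """One 3-point averaging sweep; the first and last cells are held fixed."""
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def reference(u, iters):
    """Untiled baseline: sweep the whole array `iters` times."""
    for _ in range(iters):
        u = step(u)
    return u


def tiled_with_ghost_zones(u, iters, tile, ghost):
    """Advance `u` by `iters` sweeps with one halo exchange per `ghost` sweeps.

    Returns (result, rounds, cells_touched). `rounds` counts exchange and
    synchronization points; `cells_touched` is a rough proxy for work,
    including the redundant work done inside ghost zones.
    """
    n = len(u)
    done = rounds = cells_touched = 0
    while done < iters:
        k = min(ghost, iters - done)      # sweeps performed in this round
        out = list(u)
        for start in range(0, n, tile):
            end = min(start + tile, n)
            lo = max(start - k, 0)        # extend the tile by a ghost zone,
            hi = min(end + k, n)          # clamped at the global boundary
            local = u[lo:hi]
            for _ in range(k):
                local = step(local)       # stale values creep inward by one
                cells_touched += hi - lo  # cell per sweep, never reaching
            out[start:end] = local[start - lo:end - lo]  # the tile interior
        u = out                           # stands in for the halo exchange
        done += k
        rounds += 1
    return u, rounds, cells_touched
```

With 40 cells, tiles of 8 cells, and 9 sweeps, `ghost=1` needs 9 rounds while `ghost=3` needs 3 rounds and touches more cells in total. Which setting wins depends on the relative cost of synchronization and redundant computation on a given machine, and that is the selection problem the abstract raises.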

Jiayuan Meng, Kevin Skadron

Ghost Zone | Ghost Zone Size | ICS 2009 | Parallel Architectures | Theoretical Computer Science |

Added	20 May 2010
Updated	20 May 2010
Type	Conference
Year	2009
Where	ICS
Authors	Jiayuan Meng, Kevin Skadron
