An experimental evaluation of tiling and shackling for memory hierarchy management

On modern computers, the performance of programs is often limited by memory latency rather than by processor cycle time. To reduce the impact of memory latency, the restructuring compiler community has developed locality-enhancing program transformations, the best known of which is loop tiling. Tiling is restricted to perfectly nested loops, but many imperfectly nested loops can be transformed into perfectly nested loops that can then be tiled. Recently, we proposed an alternative approach to locality enhancement called data shackling. Data shackling reasons about data traversals rather than iteration space traversals, and can be applied directly to imperfectly nested loops. We have implemented shackling in the SGI MIPSPro compiler, which already has a sophisticated implementation of tiling. Our experiments on the SGI Octane workstation with dense numerical linear algebra programs show that shackled code obtains double the performance of tiled code for most of these programs, and o...

Induprakas Kodukula, Keshav Pingali, Robert Cox, D

Distributed And Parallel Computing | ICS 1999 | Memory Latency | Nested Loops | SGI MIPSPro Compiler |

» Virtual hierarchies to support server consolidation

» A GPGPU compiler for memory optimization and parallelism management

» Optimized Dense Matrix Multiplication on a Many-Core Architecture

» Scalable and Adaptive Metadata Management in Ultra Large-Scale File Systems

» A global address space framework for locality aware scheduling of block-sparse computations

» Phase characterization for power evaluating control-flow-based and event-counter-based techniq...

Post Info

Added	04 Aug 2010
Updated	04 Aug 2010
Type	Conference
Year	1999
Where	ICS
Authors	Induprakas Kodukula, Keshav Pingali, Robert Cox, Dror E. Maydan
