Sciweavers

ICS
1997

Optimizing Matrix Multiply Using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology

Modern microprocessors can achieve high performance on linear algebra kernels, but this currently requires extensive machine-specific hand tuning. We have developed a methodology whereby near-peak performance on a wide range of systems can be achieved automatically for such routines. First, by analyzing current machines and C compilers, we've developed guidelines for writing Portable, High-Performance, ANSI C (PHiPAC, pronounced "fee-pack"). Second, rather than code by hand, we produce parameterized code generators. Third, we write search scripts that find the best parameters for a given system. We report on a BLAS GEMM compatible multi-level cache-blocked matrix multiply generator which produces code that achieves around 90% of peak on the Sparcstation-20/61, IBM RS/6000-590, HP 712/80i, SGI Power Challenge R8k, and SGI Octane R10k, and over 80% of peak on the SGI Indigo R4k. The resulting routines are competitive with vendor-optimized BLAS GEMMs. CS Division, University of Cali...
Added 08 Aug 2010
Updated 08 Aug 2010
Type Conference
Year 1997
Where ICS
Authors Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, James Demmel
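
The abstract's two central ideas, cache blocking and an automated parameter search, can be illustrated with a minimal sketch in C. This is not PHiPAC's generated code or its search scripts: the function names (matmul_naive, matmul_blocked, search_block_size), the single blocking level, the square row-major layout, and the clock()-based timing are all assumptions made for illustration. The real generator is described as multi-level and BLAS GEMM compatible, which this toy is not.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Reference triple loop: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    size_t i, j, k;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            double sum = c[i * n + j];
            for (k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* One level of cache blocking (hypothetical helper, not PHiPAC output):
 * C += A * B is computed tile by tile, with tiles of at most bs x bs
 * elements, so that each tile is reused while it is likely still cached. */
void matmul_blocked(size_t n, size_t bs,
                    const double *a, const double *b, double *c)
{
    size_t ii, jj, kk, i, j, k;
    assert(bs > 0);
    for (ii = 0; ii < n; ii += bs) {
        for (kk = 0; kk < n; kk += bs) {
            for (jj = 0; jj < n; jj += bs) {
                /* Clamp tile edges so n need not be a multiple of bs. */
                size_t i_end = (ii + bs < n) ? ii + bs : n;
                size_t k_end = (kk + bs < n) ? kk + bs : n;
                size_t j_end = (jj + bs < n) ? jj + bs : n;
                for (i = ii; i < i_end; i++) {
                    for (k = kk; k < k_end; k++) {
                        const double aik = a[i * n + k];
                        for (j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}

/* Toy parameter search (an assumption, standing in for a search script):
 * time one run per candidate block size and return the fastest candidate.
 * Note that c keeps accumulating across runs; only the timing is used. */
size_t search_block_size(size_t n, const size_t *cands, size_t ncands,
                         const double *a, const double *b, double *c)
{
    size_t t, best;
    clock_t best_time = 0;
    assert(ncands > 0);
    best = cands[0];
    for (t = 0; t < ncands; t++) {
        clock_t start = clock();
        clock_t elapsed;
        matmul_blocked(n, cands[t], a, b, c);
        elapsed = clock() - start;
        if (t == 0 || elapsed < best_time) {
            best_time = elapsed;
            best = cands[t];
        }
    }
    return best;
}
```

A caller would allocate three n x n arrays, pass a handful of candidate block sizes to search_block_size, and then use the returned value with matmul_blocked. A single clock() measurement is noisy, so a more careful search would repeat each run and use matrices large enough to exceed the cache.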