A trend is developing in high-performance computing in which commodity processors are coupled to various types of computational accelerators. Such systems are commonly called hybrid systems. In this paper, we describe our experience developing an implementation of the Linpack benchmark for a petascale hybrid system, the Roadrunner cluster built by IBM for Los Alamos National Laboratory (LANL). This system combines traditional x86_64 host processors with IBM PowerXCell™ 8i accelerator processors. The implementation of Linpack we developed was the
Michael Kistler, John A. Gunnels, Daniel A. Broken