This paper discusses high performance clustering through a series of critical topics: architectural design, system software infrastructure, and programming environment. It does so through an overview of a large-scale, high performance SuperCluster (named Roadrunner) in production at The University of New Mexico (UNM) Albuquerque High Performance Computing Center (AHPCC). This SuperCluster, sponsored by the U.S. National Science Foundation (NSF) and the National Computational Science Alliance (NCSA), is based almost entirely on freely available, vendor-independent software: for example, its operating system (Linux), job scheduler (PBS), compilers (GNU/EGCS), and parallel programming libraries (MPI). The Globus toolkit, also available for this platform, allows high performance distributed computing applications to use geographically distributed resources such as this SuperCluster. In addition to describing the design and analysis of the Roadrunner SuperCluster, we provide ...
David A. Bader, Arthur B. Maccabe, Jason R. Mastal