In this paper, we present an early performance evaluation of a 624-core cluster based on the Intel® Xeon® Processor 5560 (code named “Nehalem-EP”, and referred to as Xeon 5560 in this paper)—the third-generation quad-core architecture from Intel. This is the first processor from Intel with a non-uniform memory access (NUMA) architecture managed by on-chip integrated memory controller. It employs a point-to-point interconnect called the Intel® QuickPath Interconnect (QPI) between processors and to the input/output (I/O) hub. It also introduces to a quad-core architecture both Intel’s hyper-threading technology (or simultaneous multi-threading, “SMT”) and Intel® Turbo Boost Technology (“Turbo mode”) that automatically allow processor cores to run faster than the base operating frequency if the processor is operating below rated power, temperature, and current specification limits. It can be engaged with any number of cores or logical processors enabled and active. We...