Multi-core processors are planned for virtually all next-generation HPC systems. In a preliminary evaluation of dual-core AMD Opteron processor systems, we investigated the scaling behavior of a set of micro-benchmarks, kernels, and applications. We also evaluated a number of processor affinity techniques for managing memory placement on these multi-core systems. We found that an appropriate choice of MPI task and memory placement schemes can improve performance by more than 25% for key scientific calculations. We also collected detailed performance data for several large-scale scientific applications, and analysis of those application results confirmed our micro-benchmark and scaling results.
Sadaf R. Alam, Richard F. Barrett, Jeffery A. Kueh