Data-parallel accelerator devices such as Graphical Processing Units (GPUs) are providing dramatic performance improvements over even multicore CPUs for lattice-oriented applications in computational physics. Models such as the Ising and Potts models continue to play a role in investigating phase transitions on small-world and scale-free graph structures. These models are particularly well-suited to the performance gains possible using GPUs and relatively high-level device programming languages such as NVIDIA’s Compute Unified Device Architecture (CUDA). We report on algorithms and CUDA data-parallel programming techniques for implementing Metropolis Monte Carlo updates for the Ising using bit-packing storage, and adjacency neighbour lists for various graph structures in addition to regular hypercubic lattices. We report on parallel performance gains and also memory and performance tradeoffs using GPU/CPU and algorithmic combinations.
Kenneth A. Hawick, Arno Leist, Daniel P. Playne