This paper provides an overview of the Blue Matter application development effort within the Blue Gene project, which supports our scientific simulation efforts in the areas of protein folding and membrane-protein systems. The design philosophy of the Blue Gene/L architecture relies on large numbers of power-efficient nodes (whose technology is derived from the world of embedded microprocessors) so that many such nodes can be packed into a small volume to achieve high performance. To exploit the potential of this architecture, an application must scale well to large node counts. Because the scientific goals of the project entail simulating very long time-scales, up to microseconds, strong scaling of a fixed-size problem to these large node counts is a requirement. In pursuit of this objective, we have considered a variety of parallel decompositions and explored ways to exploit and map algorithms onto the two primary high-performance interconnects provided ...
Robert S. Germain, Blake G. Fitch, Aleksandr Raysh