In this paper we present the design, implementation, and evaluation of a runtime system based on collective I/O techniques for irregular applications. Its main goal is to provide parallel collective I/O by having all processors participate in the I/O simultaneously, and to simplify the mapping of I/O requests. With this technique, the input/output of irregular applications can be greatly simplified by always keeping global files canonically ordered, thus avoiding the use of multiple files and the associated sorting and merging steps. The runtime library has been optimized by applying in-memory compression mechanisms to the collective I/O operations. We also present the results of several evaluation experiments obtained by running a particle-in-cell application on an Intel Paragon machine. These results demonstrate that high I/O performance can be obtained by using our library.
Jesús Carretero, Jaechun No, Alok N. Choudh