To prepare for future peta- or exa-scale computing, it is important to understand how a hierarchical storage system affects the performance of data-intensive applications and, accordingly, how to leverage its strengths and mitigate its risks. To this end, this paper adopts a user-level perspective to empirically reveal the implications of storage organization for parallel programs running on Jaguar at the Oak Ridge National Laboratory. We first describe the hierarchical configuration of Jaguar's storage system. We then evaluate the performance of its individual storage components. In addition, we examine the scalability of metadata- and data-intensive benchmarks on Jaguar. We find that the file distribution pattern can affect the aggregate I/O bandwidth. Based on our analysis, we demonstrate that the scalability of a representative application, S3D, can be improved by as much as 15%.
Weikuan Yu, Sarp Oral, Shane Canon, Jeffrey S. Vet