In Grid computing, there is a growing need to process large amounts of data. To support this trend, we need efficient parallel storage systems that provide high performance for data-intensive applications. To overcome I/O bottlenecks and increase I/O parallelism, data streams need to be parallelized at both the application level and the storage device level. In this paper, we propose a novel Peer-to-Peer (P2P) storage architecture for MPI applications on Grid systems. We first present an analytic model of our P2P storage architecture. Next, we describe a profile-guided data allocation algorithm that increases the degree of I/O parallelism in the system and balances I/O in a heterogeneous system. We then present results from an actual implementation. Our experimental results show that by partitioning data across all available storage devices and carefully tuning I/O workloads in the Grid system, our Peer-to-Peer scheme can deliv...
Yijian Wang, David R. Kaeli