Sciweavers

GRID
2007
Springer

Data placement for scientific applications in distributed environments

Scientific applications often perform complex computational analyses that consume and produce large data sets. We are concerned with data placement policies that distribute data in ways that are advantageous for application execution, for example, by placing data sets so that they may be staged into or out of computations efficiently or by replicating them for improved performance and reliability. In particular, we propose to study the relationship between data placement services and workflow management systems. In this paper, we explore the interactions between two services used in large-scale science today. We evaluate the benefits of prestaging data using the Data Replication Service versus using the native data stage-in mechanisms of the Pegasus workflow management system. We use the astronomy application, Montage, for our experiments and modify it to study the effect of input data size on the benefits of data prestaging. As the size of input data sets increases, prestaging usi...
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where GRID
Authors Ann L. Chervenak, Ewa Deelman, Miron Livny, Mei-Hui Su, Robert Schuler, Shishir Bharathi, Gaurang Mehta, Karan Vahi