Scientific simulations and experiments use sophisticated data formats to store and access their data. One such format is the Hierarchical Data Format (HDF), commonly used in fusion and plasma physics, geosciences, astronomy and medical research. Most HDF data is generated remotely (at a supercomputer or an experimental site) and is not readily available to scientists. Transferring entire datasets to users’ machines for analysis and visualization can be prohibitive because of data size, bandwidth limitations or local storage limitations. In addition, different subsets of the data may be of interest at different times. Scientists therefore need a way to query and access subsets of remote data. This paper describes several approaches to providing access to remote HDF data and compares their performance. It also gives some details of the solution that we consider the best of these: a Globus-based Web Service.
Svetlana G. Shasharina, Chuang Li, Nanbor Wang, Ro