Efficient data management is a key component in achieving good performance for scientific workflows in distributed environments. Workflow applications typically communicate data be...
Computational science workflows are generating an ever-increasing volume of data products. Metadata for these workflows is communicated using one or more discipline-specific schem...
We describe an approach for pipelining nested data collections in scientific workflows. Our approach logically delimits arbitrarily nested collections of data tokens using special...
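As a rough illustration of the delimiting idea, the sketch below flattens a nested collection into a flat token stream and rebuilds it downstream. It assumes paired open/close delimiter tokens, which the truncated abstract does not confirm; the names `OPEN`, `CLOSE`, `DATA`, `to_stream`, `map_data`, and `from_stream` are hypothetical and are not taken from the paper.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional, Tuple

# Hypothetical token kinds (assumed for illustration, not from the paper).
OPEN, CLOSE, DATA = "open", "close", "data"
Token = Tuple[str, Any]


def to_stream(collection: list) -> Iterator[Token]:
    """Flatten a nested list into a token stream with paired delimiters."""
    yield (OPEN, None)
    for item in collection:
        if isinstance(item, list):
            # A nested collection gets its own open/close pair.
            yield from to_stream(item)
        else:
            yield (DATA, item)
    yield (CLOSE, None)


def map_data(tokens: Iterable[Token], fn: Callable[[Any], Any]) -> Iterator[Token]:
    """A pipelined stage: transform data tokens one at a time and pass
    delimiter tokens through unchanged, so the nesting is preserved."""
    for kind, value in tokens:
        yield (kind, fn(value)) if kind == DATA else (kind, value)


def from_stream(tokens: Iterable[Token]) -> Optional[list]:
    """Rebuild the nested list from a delimited token stream."""
    stack: List[list] = []
    root: Optional[list] = None
    for kind, value in tokens:
        if kind == OPEN:
            new: list = []
            if stack:
                stack[-1].append(new)  # attach to the enclosing collection
            stack.append(new)
        elif kind == CLOSE:
            if not stack:
                raise ValueError("close delimiter without matching open")
            root = stack.pop()
        else:
            if not stack:
                raise ValueError("data token outside any collection")
            stack[-1].append(value)
    if stack:
        raise ValueError("unclosed collection in stream")
    return root


nested = [1, [2, 3], [[4]], 5]
scaled = from_stream(map_data(to_stream(nested), lambda x: x * 10))
```

Because every function here is a generator or consumes tokens one at a time, a downstream stage can begin work before the upstream stage has emitted the whole collection, which is the sense of pipelining this sketch is meant to convey.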
Data lineage and data provenance are key to the management of scientific data. Not knowing the exact provenance and processing pipeline used to produce a derived data set often re...