In this paper we propose DFL -- a formal, graphical workflow language for dataflows, i.e., workflows where large amounts of complex data are manipulated, and the structure of the manipulated data is reflected in the structure of the workflow. It is a common extension of (1) Petri nets, which are responsible for the organization of the processing tasks, and (2) nested relational calculus, which is a database query language over complex objects, and is responsible for handling collections of data items (in particular, for iteration) and for the typing system. We demonstrate that dataflows constructed in a hierarchical manner, according to a set of refinement rules we propose, are semi-sound, i.e., initiated with a single token (which may represent a complex scientific data collection) in the input node, terminate with a single token in the output node (which represents the output data collection). In particular they never leave any "debris data" behind and an output is always ...