views abstract groups of tasks in a workflow into high level composite tasks, in order to reuse sub-workflows and facilitate provenance analysis. However, unless a view is carefully designed, it may not preserve the dataflow between tasks in the workflow, i.e., it may not be sound. Unsound views can be misleading and cause incorrect provenance analysis. This paper studies the problem of efficiently identifying and correcting unsound workflow views with minimal changes. In particular, given a workflow view, we wish to split each unsound composite task into the minimal number of tasks, such that the resulting view is sound. We prove that this problem is NP-hard by reduction from independent set. We then propose two local optimality conditions (weak and strong), and design polynomial time algorithms for correcting unsound views to meet these conditions. Experiments show that our proposed algorithms are effective and efficient, and that the strong local optimality algorithm produces bette...
Peng Sun, Ziyang Liu, Susan B. Davidson, Yi Chen