A technique called user views has recently been proposed to focus user attention on relevant information in response to provenance queries over workflow executions [1, 2]: Given user input on what modules in the workflow specification are relevant to the user, a user view is a concise representation that clusters together modules to create a small number of composite modules (or clusters) such that (1) each composite module in a user view contains at most one relevant (atomic) module, thus assuming the "meaning" of that module; and (2) no control or data dependencies (either direct or indirect) are introduced (soundness) or removed (completeness) between relevant modules. The goal is to find a user view with a smallest number of composite modules. We show that for workflow specifications that are general graphs, regardless of the number of distinct modules in the input workflow and the structure of interaction between them, there always exists a user view of size at most (2k...
Olivier Biton, Susan B. Davidson, Sanjeev Khanna,