Building visualization and analysis pipelines is a large hurdle in the adoption of visualization and workflow systems by domain scientists. In this paper, we propose techniques to help users construct pipelines by consensus--automatically suggesting completions based on a database of previously created pipelines. In particular, we compute correspondences between existing pipeline subgraphs from the database, and use these to predict sets of likely pipeline additions to a given partial pipeline. By presenting these predictions in a carefully designed interface, users can create visualizations and other data products more efficiently because they can augment their normal work patterns with the suggested completions. We present an implementation of our technique in a publicly-available, open-source scientific workflow system and demonstrate efficiency gains in real-world situations.
David Koop, Carlos Eduardo Scheidegger, Steven P.