Data-intensive e-science applications often rely on third-party data found in public repositories, whose quality is largely unknown. Although scientists are aware that this uncertainty may lead to incorrect scientific conclusions, in the absence of a quantitative characterization of data quality properties they find it difficult to formulate precise data acceptability criteria. We present an Information Quality management workbench, called Qurator, that supports data experts in the specification of personal quality models, and lets them derive effective criteria for data acceptability. The demo of our working prototype will illustrate our approach on a real e-science workflow for a bioinformatics application.

Categories and Subject Descriptors: H.3.3 Information Search and Retrieval: Information filtering

General Terms: Management, Measurement, Experimentation.
Alun D. Preece, Binling Jin, Paolo Missier, R. Mar