We present Data Auditor, a tool for exploring data quality and data semantics. Given a rule or an integrity constraint and a target relation, Data Auditor computes pattern tableaux, which concisely summarize subsets of the relation that (mostly) satisfy or (mostly) fail the constraint. This paper describes 1) the architecture and user interface of Data Auditor, 2) the supported constraints for testing data consistency and completeness, 3) the heuristics used by Data Auditor to “tune” a given constraint or its associated parameters for a better fit with the data, and 4) several demonstration scenarios using real data sets.
Lukasz Golab, Howard J. Karloff, Flip Korn, Divesh