Sciweavers

SIGMOD
2010
ACM

GDR: a system for guided data repair

Improving data quality is a time-consuming, labor-intensive and often domain-specific operation. Existing data repair approaches are either fully automated or do not involve users efficiently in an interactive way. We present a demo of GDR, a Guided Data Repair system that uses a novel approach to efficiently involve the user alongside automatic data repair techniques to reach better data quality as quickly as possible. Specifically, GDR generates data repairs and acquires user feedback on those repairs that would be most beneficial in improving the data quality. GDR quantifies the data quality benefit of generated repairs by combining mechanisms from decision theory and active learning. Based on these benefit scores, groups of repairs are ranked and displayed to the user. User feedback is used to train a machine learning component that eventually replaces the user in deciding on the validity of a suggested repair. We describe how the generated repairs are ranked and displayed to the user in a "useful...
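The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not GDR's actual scoring: the Repair fields, the grouping key, the expected-gain formula and the uncertainty measure are all assumptions made for illustration. Groups are ranked by expected quality gain (a stand-in for the decision-theoretic part), and repairs inside a group are ordered so the most uncertain ones are shown first (a stand-in for the active-learning part).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Repair:
    """A suggested cell update (all fields are hypothetical)."""
    row: int
    attribute: str
    new_value: str
    p_correct: float  # assumed model estimate that the repair is valid
    gain: float       # assumed quality gain if the repair is valid


def uncertainty(p: float) -> float:
    """Peaks at p = 0.5 and falls to 0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def group_benefit(repairs: list) -> float:
    """Expected quality gain of a group: sum of p_correct * gain."""
    return sum(r.p_correct * r.gain for r in repairs)


def rank_groups(repairs: list) -> list:
    """Group repairs by (attribute, new_value); rank groups by benefit."""
    groups = defaultdict(list)
    for r in repairs:
        groups[(r.attribute, r.new_value)].append(r)
    return sorted(groups.items(),
                  key=lambda kv: group_benefit(kv[1]),
                  reverse=True)


def order_within_group(repairs: list) -> list:
    """Show the repairs the model is least sure about first."""
    return sorted(repairs,
                  key=lambda r: uncertainty(r.p_correct),
                  reverse=True)


if __name__ == "__main__":
    suggestions = [
        Repair(1, "city", "Chicago", 0.9, 1.0),
        Repair(2, "city", "Chicago", 0.5, 1.0),
        Repair(3, "zip", "60601", 0.6, 1.0),
    ]
    for key, members in rank_groups(suggestions):
        rows = [r.row for r in order_within_group(members)]
        print(key, round(group_benefit(members), 2), rows)
```

In this sketch, the user's answers on the uncertain repairs would become training labels for the model that produces p_correct, so that confident repairs could later be decided without asking.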
Added 06 Dec 2010
Updated 06 Dec 2010
Type Conference
Year 2010
Where SIGMOD
Authors Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Neville, Mourad Ouzzani