Data transformation often requires users to write many trivial, task-dependent programs to transform thousands of records. Recent programming-by-example (PBE) approaches enable users to transform data without coding. A key challenge for these PBE approaches is to deliver correctly transformed results on large datasets, as the transformation programs are likely to be generated by non-expert users. To address this challenge, existing approaches aim to identify a small set of potentially incorrect records and ask users to examine these records instead of the entire dataset. However, because transformation scenarios are highly task-dependent, existing approaches cannot capture the incorrect records across the variety of scenarios. In this paper, we present an approach that learns from past transformation scenarios to generate a meta-classifier that identifies the incorrect records. Our approach color-codes these transformed records and then presents them for users to examine. The approach allows users to either ente...
Bo Wu, Craig A. Knoblock