In this paper, we describe and situate the TUPELO system for data mapping in relational databases. Automating the discovery of mappings between structured data sources is a long-standing and important problem in data management. Starting from user-provided example instances of the source and target schemas, TUPELO approaches mapping discovery as a search within the transformation space of these instances, based on a set of mapping operators. TUPELO mapping expressions incorporate not only data-metadata transformations but also simple and complex semantic transformations, resulting in significantly wider applicability than previous systems. Extensive empirical validation of TUPELO on both synthetic and real-world datasets indicates that the approach is viable and effective.
George H. L. Fletcher, Catharine M. Wyss