Technologies for overcoming heterogeneities between autonomous data sources are key in the emerging networked world. In this paper, we discuss the initial results of a formal investigation into the underpinnings of technologies for alleviating structural heterogeneity. At the core of structural heterogeneity is the data mapping problem: discovering effective mappings between structured representations of data. Automating the discovery of these mappings is one of the fundamental unsolved challenges for data interoperability, integration, and sharing. We introduce a novel data model and calculus for expressing data mappings between relational data sources, laying the groundwork for a better understanding of the data mapping problem. This research uncovers several new safety issues in data mapping languages. We discuss ongoing investigations of syntactic and semantic restrictions on the calculus to address these issues.
George H. L. Fletcher, Catharine M. Wyss, Edward L