One issue in accessing multiple heterogeneous information sources is how to integrate the retrieved data. SIMS, an information mediator, handles this problem by mapping the data in each information source into a common domain model. This model defines the relationships between the data in the different sources, so that the data can be integrated properly. Currently, human experts who are familiar with the contents of the information sources must generate the model and map the sources to it. Automating the construction of this model would be more convenient and efficient. Assuming that each information source is structured as a set of tables, the first step in this process is to mine the tables to discover relationships in the data. These relationships are then used to develop the model further by helping to determine the necessary classes, superclass/subclass relationships, and associations.
Sheila Tejada, Craig A. Knoblock, Steven Minton
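
The abstract does not say how the tables are mined, so the following is only a minimal sketch of one plausible first step: comparing the sets of values found in columns of different tables. A containment between two columns hints at a superclass/subclass relationship, and a partial overlap hints at an association. The table names, column names, and data, as well as the helper names `column_values`, `classify`, and `discover_relationships`, are hypothetical and are not taken from SIMS.

```python
from itertools import combinations

# Hypothetical example data: each table is a list of rows (dicts).
EXAMPLE_TABLES = {
    "seaport": [
        {"name": "Port A", "country": "X"},
        {"name": "Port B", "country": "Y"},
        {"name": "Port C", "country": "Z"},
    ],
    "large_seaport": [
        {"name": "Port A", "depth": 40},
        {"name": "Port B", "depth": 55},
    ],
}


def column_values(tables):
    """Collect the set of distinct non-null values for every (table, column)."""
    values = {}
    for table_name, rows in tables.items():
        for row in rows:
            for column, value in row.items():
                if value is not None:
                    values.setdefault((table_name, column), set()).add(value)
    return values


def classify(a, b):
    """Name the set relationship between two columns' value sets."""
    if a == b:
        return "equal"
    if a < b:
        return "subset"      # a is a candidate subclass of b
    if a > b:
        return "superset"    # a is a candidate superclass of b
    if a & b:
        return "overlap"     # candidate association
    return "disjoint"


def discover_relationships(tables):
    """Compare columns across different tables and keep the non-disjoint pairs."""
    values = column_values(tables)
    found = []
    for left, right in combinations(sorted(values), 2):
        if left[0] == right[0]:
            continue  # assumption: only cross-table relationships are of interest
        relation = classify(values[left], values[right])
        if relation != "disjoint":
            found.append((left, relation, right))
    return found


if __name__ == "__main__":
    for left, relation, right in discover_relationships(EXAMPLE_TABLES):
        print(left, relation, right)
```

On the example data, the only relationship reported is that the `name` values of `large_seaport` are a strict subset of the `name` values of `seaport`, which would suggest modeling large seaports as a subclass of seaports. Comparing raw value sets is a simplification: it ignores column types and accidental value collisions, which a fuller approach would need to handle.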