Information retrieval and machine learning for probabilistic schema matching



Schema matching is the problem of finding correspondences (mapping rules, e.g. logical formulae) between heterogeneous schemas, e.g. in the data exchange domain or for distributed IR in federated digital libraries. This paper introduces a probabilistic framework, called sPLMap, for automatically learning schema mapping rules based on given instances of both schemas. Different techniques, mostly from the IR and machine learning fields, are combined to find suitable mapping candidates. Our approach gives a probabilistic interpretation of the prediction weights of the candidates, selects the rule set with the highest matching probability, and outputs probabilistic rules that can deal with the intrinsic uncertainty of the mapping process. The approach and several of its variants have been evaluated on several test sets. © 2006 Elsevier Ltd. All rights reserved.
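To make the pipeline in the abstract concrete, the following is a minimal sketch and not the sPLMap implementation. It assumes that each candidate rule maps one source attribute to one target attribute, that raw prediction weights are turned into probabilities with a logistic function, and that rules are independent, so the most probable rule set is obtained by picking the best candidate per target attribute. The function names, the attribute names, the `T.`/`S.` prefixes, and the rule notation are all hypothetical.

```python
import math


def logistic(weight, b0=0.0, b1=1.0):
    """Map a raw prediction weight to (0, 1).

    The logistic form and its parameters b0, b1 are assumptions
    made for this sketch, not taken from the paper.
    """
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * weight)))


def best_rule_set(candidates):
    """Select one source attribute per target attribute.

    candidates: dict mapping target attribute -> {source attribute: raw weight}.
    Under the independence assumption, the probability of a rule set is the
    product of its rule probabilities, so maximizing it reduces to choosing
    the highest-weighted candidate for each target (the logistic function is
    monotonic, so ranking by weight equals ranking by probability).
    """
    rules = []
    for target, sources in candidates.items():
        source, weight = max(sources.items(), key=lambda kv: kv[1])
        rules.append((target, source, logistic(weight)))
    return rules


def format_rule(target, source, probability):
    """Render a rule in an illustrative, Datalog-like notation."""
    return f"{probability:.2f} T.{target}(x) :- S.{source}(x)"


# Hypothetical candidate weights, e.g. produced by combining several
# IR and machine learning classifiers over instances of both schemas.
candidates = {
    "title": {"heading": 2.0, "body": -1.0},
    "author": {"creator": 0.5, "heading": 0.0},
}

for rule in best_rule_set(candidates):
    print(format_rule(*rule))
```

Each printed rule carries its own probability, which is how a rule set of this kind can express uncertainty about individual correspondences instead of committing to a yes/no mapping.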

Henrik Nottelmann, Umberto Straccia
