

Source-aware Entity Matching: A Compositional Approach

Entity matching (a.k.a. record linkage) plays a crucial role in integrating multiple data sources, and numerous matching solutions have been developed. However, the solutions have largely exploited only information available in the mentions and employed a single matching technique. We show how to exploit information about data sources to significantly improve matching accuracy. In particular, we observe that different sources often vary substantially in their level of semantic ambiguity, thus requiring different matching techniques. In addition, it is often beneficial to group and match mentions in related sources first, before considering other sources. These observations lead to a large space of matching strategies, analogous to the space of query evaluation plans considered by a relational optimizer. We propose viewing entity matching as a composition of basic steps into a "match execution plan". We analyze formal properties of the plan space, and show how to find a good ...
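The idea of composing basic matching steps into a plan can be illustrated with a minimal Python sketch. This is not the authors' algorithm or their plan-search procedure. The `Mention` type, the two matchers, the source labels, and the sample data are all hypothetical, chosen only to show a lenient matcher applied within a low-ambiguity source, a stricter matcher within an ambiguous one, and a final step that matches across the resulting groups.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    source: str   # hypothetical source label
    name: str
    context: str  # hypothetical extra evidence, e.g. affiliation keywords


def name_match(a, b):
    # Lenient matcher: assumed adequate for a source with little name ambiguity.
    return a.name.lower() == b.name.lower()


def name_and_context_match(a, b):
    # Stricter matcher for ambiguous sources: names agree and contexts share a token.
    shared = set(a.context.lower().split()) & set(b.context.lower().split())
    return name_match(a, b) and bool(shared)


def match_step(clusters, matcher):
    """One basic step: merge clusters whenever any cross-cluster pair matches."""
    clusters = [set(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(matcher(a, b) for a in clusters[i] for b in clusters[j]):
                clusters[i] |= clusters[j]
                del clusters[j]
                merged = True
                break  # cluster indices changed, so restart the scan
    return clusters


def run_plan(mentions, groups, final_matcher):
    """Match within each source group first, then across groups.

    `groups` is a list of (set_of_sources, matcher) pairs; this sketch assumes
    the groups together cover every source that appears in `mentions`.
    """
    clusters = []
    for sources, matcher in groups:
        local = [{m} for m in mentions if m.source in sources]
        clusters.extend(match_step(local, matcher))
    return match_step(clusters, final_matcher)


# Made-up mentions from two hypothetical sources.
mentions = [
    Mention("homepages", "J. Smith", "wisconsin databases"),
    Mention("homepages", "j. smith", "wisconsin databases"),
    Mention("bibliography", "J. Smith", "wisconsin integration"),
    Mention("bibliography", "J. Smith", "oxford chemistry"),
]
groups = [
    ({"homepages"}, name_match),
    ({"bibliography"}, name_and_context_match),
]
result = run_plan(mentions, groups, name_and_context_match)
```

In this toy run the two homepage mentions merge under the lenient matcher, the two bibliography mentions stay apart under the stricter one, and the final cross-group step joins only the mentions that share context. Choosing which groups to form and which matcher to assign to each is the plan-selection problem the abstract refers to, and the sketch does not attempt it.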
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2007
Where ICDE
Authors Warren Shen, Pedro DeRose, Long Vu, AnHai Doan, Raghu Ramakrishnan