Validating Multi-column Schema Matchings by Type



Validation of multi-column schema matchings is essential for successful database integration. This task is especially difficult when the databases to be integrated contain little overlapping data, as is often the case in practice (e.g., customer bases of different companies). Based on the intuition that values in different columns related by a schema matching will have a similar "semantic type", and that this type can be captured using distributions over values ("statistical types"), we develop a method for validating 1:1 and compositional schema matchings. Our technique is based on three key technical ideas. First, we propose a generic measure for comparing two columns matched by a schema matching, based on a notion of information-theoretic discrepancy that generalizes the standard geometric discrepancy; this provides the basis for validating 1:1 matchings. Second, we present an algorithm for "splitting" the string values in a column to identify substrings that are ...
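
The idea of comparing two matched columns through their value distributions, rather than through overlapping values, can be illustrated with a minimal sketch. This is not the paper's information-theoretic discrepancy: it assumes character bigram distributions as a crude "statistical type" and uses the Jensen-Shannon divergence as a stand-in measure. The function names and the sample columns are hypothetical.

```python
import math
from collections import Counter


def qgram_distribution(values, q=2):
    """Empirical distribution over character q-grams of a column's values.

    Assumption: q-grams are used here as a simple proxy for a
    "statistical type"; the paper defines its own notion.
    """
    counts = Counter()
    for value in values:
        s = str(value)
        for i in range(len(s) - q + 1):
            counts[s[i:i + q]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column has no q-grams of the requested length")
    return {gram: c / total for gram, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL divergence of a from the mixture; terms with a[k] == 0 vanish.
        return sum(a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def column_discrepancy(col_a, col_b, q=2):
    """Stand-in discrepancy between two columns (lower = more similar)."""
    return js_divergence(qgram_distribution(col_a, q),
                         qgram_distribution(col_b, q))


if __name__ == "__main__":
    # Hypothetical columns from two databases with no shared values.
    phones_a = ["555-0101", "555-0199"]
    phones_b = ["555-0142", "555-0177"]
    names = ["Alice Smith", "Bob Jones"]
    print(column_discrepancy(phones_a, phones_b))
    print(column_discrepancy(phones_a, names))
```

In this sketch the two phone columns share no values, yet their bigram distributions are closer to each other than to the name column, which is the kind of signal a type-based validation could exploit when the data overlap is small.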



Compositional Schema Matchings | Database | ICDE 2008 | Multi-column Schema Matchings | Schema Matching


Added	01 Nov 2009
Updated	01 Nov 2009
Type	Conference
Year	2008
Where	ICDE
Authors	Bing Tian Dai, Nick Koudas, Divesh Srivastava, Anthony K. H. Tung, Suresh Venkatasubramanian
