Sub-Merge: Diving Down to the Attribute-Value Level in Statistical Schema Matching


Download www.aaai.org

Matching and merging data from conflicting sources is the bread and butter of data integration, which drives search verticals, e-commerce comparison sites and cyber intelligence. Schema matching lifts data integration, traditionally focused on well-structured data, to highly heterogeneous sources. While schema matching has enjoyed significant success in matching data attributes, inconsistencies can exist at a deeper level, making full integration difficult or impossible. We propose a more fine-grained approach that focuses on correspondences between the values of attributes across data sources. Since the semantics of attribute values derive from their use and co-occurrence, we argue for the suitability of canonical correlation analysis (CCA) and its variants. We demonstrate the superior statistical and computational performance of multiple sparse CCA compared to a suite of baseline algorithms, on two datasets which we are releasing to stimulate further research. Our crowd-annota...
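To make the co-occurrence idea concrete, here is a minimal sketch of value-level matching with plain regularised CCA. It is not the multiple sparse CCA variant evaluated in the paper, and everything in it is an assumption made for illustration: the toy colour data, the helper names `one_hot`, `cca_value_embeddings` and `match_values`, the ridge term `reg`, and the choice to pair values by cosine similarity of their canonical weights.

```python
import numpy as np

def one_hot(labels, vocab):
    """Indicator matrix: one row per record, one column per attribute value."""
    return np.array([[1.0 if l == v else 0.0 for v in vocab] for l in labels])

def cca_value_embeddings(X, Y, k, reg=1e-3):
    """Regularised CCA via SVD of the whitened cross-covariance.

    Returns canonical weights A (one row per column of X), B (one row per
    column of Y), and the top-k canonical correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The ridge term keeps the covariances invertible (one-hot columns are collinear).
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, :k], Wy @ Vt.T[:, :k], s[:k]

def match_values(A, B, names_x, names_y):
    """Pair each value of source X with the most similar value of source Y."""
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    Bn = B / np.linalg.norm(B, axis=1, keepdims=True)
    best = (An @ Bn.T).argmax(axis=1)
    return {names_x[i]: names_y[int(j)] for i, j in enumerate(best)}

# Hypothetical toy data: the same 12 records described by two sources
# that use different vocabularies for one attribute.
src_a = ["red"] * 5 + ["blue"] * 4 + ["green"] * 3
translate = {"red": "crimson", "blue": "navy", "green": "lime"}
src_b = [translate[v] for v in src_a]

vocab_a = ["red", "blue", "green"]
vocab_b = ["lime", "crimson", "navy"]  # deliberately in a different order

X = one_hot(src_a, vocab_a)
Y = one_hot(src_b, vocab_b)
A, B, corr = cca_value_embeddings(X, Y, k=2)
matches = match_values(A, B, vocab_a, vocab_b)
print(matches)
```

On this noise-free toy input the values that always co-occur receive identical canonical weights, so each value in the first source pairs with its counterpart in the second. Real sources are noisy and have many attributes, which is where the sparsity constraints discussed in the abstract would come in.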

Zhe Lim, Benjamin I. P. Rubinstein


AAAI 2015 | Intelligent Agents


Post Info

Added	27 Mar 2016
Updated	27 Mar 2016
Type	Journal
Year	2015
Where	AAAI
Authors	Zhe Lim, Benjamin I. P. Rubinstein
