Mining complex matchings across Web query interfaces

To enable information integration, schema matching is a critical step for discovering semantic correspondences of attributes across heterogeneous sources. As a new attempt, this paper studies such matching as a data mining problem. Specifically, although complex matchings are common, most existing techniques focus on simple 1:1 matchings because the search space for complex matchings is far larger. To tackle this challenge, this paper takes a conceptually novel approach by viewing schema matching as correlation mining, for our task of matching Web query interfaces to integrate the myriad databases on the Internet. On this “deep Web,” query interfaces generally form complex matchings between attribute groups (e.g., {author} corresponds to {first name, last name} in the Books domain). We observe that the co-occurrence patterns across query interfaces often reveal such complex semantic relationships: grouping attributes (e.g., {first name, last name}) tend to be co-present in query interfaces an...
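The co-presence observation in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or its correlation measure; it swaps in a plain Jaccard coefficient over attribute pairs, and the toy `interfaces` data and the function name `cooccurrence_scores` are hypothetical, chosen only to show how attributes that appear together across many interfaces stand out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set holds the attribute names found on one
# query interface in a Books-like domain.
interfaces = [
    {"first name", "last name", "title"},
    {"first name", "last name", "isbn"},
    {"author", "title"},
    {"author", "isbn", "title"},
    {"first name", "last name", "title", "isbn"},
]

def cooccurrence_scores(interfaces):
    """Return a Jaccard score for every attribute pair that co-occurs at least once.

    Jaccard is an illustrative stand-in, not the measure used in the paper.
    """
    attr_counts = Counter()   # number of interfaces containing each attribute
    pair_counts = Counter()   # number of interfaces containing both attributes
    for attrs in interfaces:
        attr_counts.update(attrs)
        pair_counts.update(combinations(sorted(attrs), 2))

    scores = {}
    for (a, b), both in pair_counts.items():
        either = attr_counts[a] + attr_counts[b] - both
        scores[(a, b)] = both / either
    return scores

scores = cooccurrence_scores(interfaces)

# List pairs from most to least frequently co-present.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy data, `first name` and `last name` always appear together, so the pair receives the highest possible score, which is the kind of signal the abstract attributes to grouping attributes.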

Bin He, Kevin Chen-Chuan Chang, Jiawei Han
