Sciweavers

SIGMOD
2002
ACM

Approximate XML joins

XML is widely recognized as the data interchange standard for tomorrow because of its ability to represent data from a wide variety of sources. Hence, XML is likely to be the format through which data from multiple sources is integrated. In this paper we study the problem of integrating XML data sources through correlations realized as join operations. A challenging aspect of this operation is the XML document structure: two documents might convey approximately or exactly the same information yet be quite different in structure. Consequently, approximate match in structure, in addition to content, has to be folded into the join operation. We quantify approximate match in structure and content using well-defined notions of distance. For structure, we propose computationally inexpensive lower and upper bounds for the tree edit distance metric between two trees. We then show how the tree edit distance, and other metrics that quantify distance between trees, can be incorporated in a joi...
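To illustrate how an inexpensive lower bound can help in a distance-based join, here is a minimal Python sketch of a filter step. It does not reproduce the bounds proposed in the paper, which the truncated abstract does not spell out. Instead it assumes unit-cost insert, delete and relabel operations and uses two simple, well-known lower bounds: the difference in node counts and half the L1 distance between label histograms. The names `Node`, `ted_lower_bound` and `candidate_pairs` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import ceil


@dataclass
class Node:
    """A labeled, ordered tree node standing in for an XML element (hypothetical type)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def labels(tree):
    """Yield every node label in the tree in pre-order."""
    yield tree.label
    for child in tree.children:
        yield from labels(child)


def ted_lower_bound(t1, t2):
    """Cheap lower bound on the unit-cost tree edit distance.

    Assumption: insert, delete and relabel each cost 1. An insert or delete
    changes the node count by one, and every operation changes the L1
    distance between label histograms by at most two, so both quantities
    below can never exceed the true edit distance.
    """
    h1, h2 = Counter(labels(t1)), Counter(labels(t2))
    size_gap = abs(sum(h1.values()) - sum(h2.values()))
    # (h1 - h2) + (h2 - h1) keeps only the positive differences in each direction.
    l1 = sum(((h1 - h2) + (h2 - h1)).values())
    return max(size_gap, ceil(l1 / 2))


def candidate_pairs(left, right, threshold):
    """Return index pairs that survive the lower-bound filter.

    A pair whose lower bound already exceeds the threshold cannot join, so
    only the surviving pairs would need an exact (expensive) distance check.
    """
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if ted_lower_bound(a, b) <= threshold
    ]


if __name__ == "__main__":
    a = Node("book", [Node("title"), Node("author")])
    b = Node("book", [Node("title"), Node("writer")])
    c = Node("article", [Node("title"), Node("author"), Node("year"), Node("venue")])
    print(candidate_pairs([a], [b, c], threshold=1))
```

In this sketch, pairs that pass the filter are only candidates: a complete join would still verify them with an exact tree edit distance computation, and a matching upper bound could be used to accept some pairs without that check.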
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2002
Where SIGMOD
Authors Sudipto Guha, H. V. Jagadish, Nick Koudas, Divesh Srivastava, Ting Yu