Abstract. Recent work on XML change detection has focused on detecting changes to either ordered or unordered XML documents. In practice, however, XML documents are not always purely ordered or purely unordered: a single document may contain both ordered and unordered nodes (we call such documents hybrid XML). In this paper, we present a technique for detecting changes to hybrid XML documents. In our approach, the old and new versions of an XML document are first stored in a relational database. An order learning module then determines the type of each node (ordered or unordered) in the hybrid XML. Finally, the change detection module uses this knowledge of node types to detect changes by issuing SQL queries. Our experimental results show that our approach produces better result quality than existing approaches.
Erwin Leonardi, Sri L. Budiman, Sourav S. Bhowmick
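The pipeline outlined in the abstract (relational storage, node types, SQL-based change detection) can be illustrated with a minimal sketch. It is not the authors' implementation: the schema (`leaf`, `ordered_parent`), the function names, and the use of SQLite are all assumptions made for illustration. The order learning step is not modelled; the set of ordered parent paths is simply passed in. Leaves under an ordered parent are matched by position and content, leaves under an unordered parent by content only, and duplicate siblings are treated as a set, which is a simplification.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, version, xml_text):
    """Store every leaf element as (version, parent path, position, content)."""
    root = ET.fromstring(xml_text)

    def walk(node, path):
        for pos, child in enumerate(node):
            if len(child) == 0:  # leaf element: no child elements
                content = f"{child.tag}={(child.text or '').strip()}"
                conn.execute(
                    "INSERT INTO leaf VALUES (?, ?, ?, ?)",
                    (version, path, pos, content),
                )
            else:
                walk(child, f"{path}/{child.tag}")

    walk(root, f"/{root.tag}")


def unmatched(conn, src, dst):
    """Return leaves of version `src` that have no counterpart in version `dst`.

    Position is compared only when the parent is listed in ordered_parent
    (hypothetical table standing in for the output of order learning).
    """
    sql = """
        SELECT s.parent, s.content
        FROM leaf AS s
        WHERE s.version = ?
          AND NOT EXISTS (
              SELECT 1 FROM leaf AS d
              WHERE d.version = ?
                AND d.parent = s.parent
                AND d.content = s.content
                AND (d.pos = s.pos
                     OR s.parent NOT IN (SELECT parent FROM ordered_parent)))
        ORDER BY s.parent, s.pos
    """
    return conn.execute(sql, (src, dst)).fetchall()


def detect_changes(old_xml, new_xml, ordered_parents):
    """Shred both versions into an in-memory database and diff them with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leaf (version TEXT, parent TEXT, pos INTEGER, content TEXT)"
    )
    conn.execute("CREATE TABLE ordered_parent (parent TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO ordered_parent VALUES (?)", [(p,) for p in ordered_parents]
    )
    shred(conn, "old", old_xml)
    shred(conn, "new", new_xml)
    result = {
        "deleted": unmatched(conn, "old", "new"),
        "inserted": unmatched(conn, "new", "old"),
    }
    conn.close()
    return result
```

For example, if two versions differ only in the order of the children of `/lib/authors` and `/lib/steps`, and only `/lib/steps` is declared ordered, the sketch reports changes under `/lib/steps` and none under `/lib/authors`. A purely ordered detector would flag both subtrees, and a purely unordered one would flag neither.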