Previous approaches to change detection in XML documents are not suitable for large documents because they require a lot of memory to hold both versions of a document at once. In this article, we take a more conservative yet novel approach: using traditional relational database engines to detect changes to large ordered XML documents. To this end, we have implemented a prototype system called Xandy that converts XML documents into relational tuples and detects changes from these tuples using SQL queries. Our experimental results show that the relational approach scales better than published algorithms such as X-Diff. In some cases, its efficiency and result quality are comparable to those of X-Diff. Our experimental results also show that Xandy generally has better result quality than XyDiff.
Erwin Leonardi, Sourav S. Bhowmick
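
To make the idea of shredding XML into tuples and detecting changes with SQL concrete, the following is a minimal illustrative sketch. It is not Xandy's schema or algorithm: the single `leaf` table, the `shred` and `detect_changes` functions, and the purely position-based matching of nodes by path are all assumptions made for illustration, and SQLite stands in for a full relational engine.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, version, xml_text):
    """Store every leaf element of one version as a (version, path, value) row.

    Hypothetical schema for illustration only; paths encode sibling order,
    e.g. /books[1]/book[2]/title[1].
    """
    root = ET.fromstring(xml_text)

    def walk(node, path):
        children = list(node)
        if not children:
            conn.execute(
                "INSERT INTO leaf VALUES (?, ?, ?)",
                (version, path, (node.text or "").strip()),
            )
        counts = {}  # per-tag position among siblings
        for child in children:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")

    walk(root, f"/{root.tag}[1]")


def detect_changes(old_xml, new_xml):
    """Return updated, inserted, and deleted leaves found by SQL queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE leaf (version INTEGER, path TEXT, value TEXT)")
    shred(conn, 1, old_xml)
    shred(conn, 2, new_xml)

    # Same path in both versions, different text value.
    updated = conn.execute(
        """SELECT o.path, o.value, n.value
           FROM leaf o JOIN leaf n ON o.path = n.path
           WHERE o.version = 1 AND n.version = 2 AND o.value <> n.value
           ORDER BY o.path"""
    ).fetchall()

    # Path present in one version but absent from the other.
    only_in = """SELECT a.path FROM leaf a
                 WHERE a.version = ? AND NOT EXISTS (
                     SELECT 1 FROM leaf b
                     WHERE b.version = ? AND b.path = a.path)
                 ORDER BY a.path"""
    inserted = [row[0] for row in conn.execute(only_in, (2, 1))]
    deleted = [row[0] for row in conn.execute(only_in, (1, 2))]
    conn.close()
    return {"updated": updated, "inserted": inserted, "deleted": deleted}
```

Because the comparison happens inside the database, the two versions do not have to be held as in-memory trees while they are compared, which is the scalability argument made above. Matching nodes by position alone is a deliberate simplification: an insertion in the middle of a sibling list would shift later paths and be reported as a series of updates.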