This demonstration paper presents a probabilistic XML data merging tool that represents the outcome of semi-structured document integration as a probabilistic tree. The system is fully automated and integrates methods to evaluate the uncertainty (modeled as probability values) of the merge result. It is based on the two-way tree-merge technique and an uncertain data model defined using probabilistic event variables. The resulting probabilistic repository can be queried using a subset of the XPath query language. The demonstration application is based on revisions of the Wikipedia encyclopedia: a Wikipedia article is no longer considered to be its latest valid revision, but rather the merge of all possible revisions, some of which are uncertain.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications
General Terms: Algorithms
Keywords: Probabilistic XML, XML merge, tree merge
Talel Abdessalem, M. Lamine Ba, Pierre Senellart
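
To make the idea of an uncertain data model based on probabilistic event variables more concrete, the following is a minimal illustrative sketch, not the tool's actual implementation. It assumes that each tree node carries a conjunction of event literals, that events are mutually independent, and that a node exists only if its own condition and those of all its ancestors hold. The names `PNode`, `existence_probability`, and the events `e1` and `e2` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PNode:
    """A tree node guarded by a conjunction of event literals (hypothetical model)."""
    label: str
    # Maps an event name to the truth value it must take for this node to exist.
    condition: Dict[str, bool] = field(default_factory=dict)
    children: List["PNode"] = field(default_factory=list)


def existence_probability(path: List[PNode], event_prob: Dict[str, float]) -> float:
    """Probability that every node on a root-to-node path exists.

    Assumes events are independent; contradictory literals give probability 0.
    """
    required: Dict[str, bool] = {}
    for node in path:
        for event, value in node.condition.items():
            # setdefault keeps the first requirement seen for this event.
            if required.setdefault(event, value) != value:
                return 0.0  # the path needs both e and not-e
    p = 1.0
    for event, value in required.items():
        p *= event_prob[event] if value else 1.0 - event_prob[event]
    return p


# Illustrative data: a paragraph that exists if revision event e1 is valid
# and a later (hypothetical) revision event e2, which deletes it, is not.
paragraph = PNode("paragraph", {"e1": True, "e2": False})
section = PNode("section", {"e1": True}, [paragraph])
article = PNode("article", {}, [section])

event_prob = {"e1": 0.9, "e2": 0.2}
p_paragraph = existence_probability([article, section, paragraph], event_prob)
print(round(p_paragraph, 2))
```

Under these assumptions, the paragraph's probability is the product of the probability of `e1` and the probability that `e2` does not hold, which is approximately 0.72 for the values above. A query over such a tree would, in this simplified view, return each matching node together with a probability computed in this way.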