In recent years, XML has become very popular for representing semistructured data and a standard for data exchange over the web. Mining XML data from the web is becoming increasingly important. Several encouraging methods for mining XML data have been proposed. However, efficiency and simplicity remain barriers to further development. Mining XML data normally requires preprocessing or postprocessing, such as transforming the data from XML format to relational format. In this paper, we show that association rules can be extracted from XML documents using XQuery, without any preprocessing or postprocessing, and we analyze the XQuery implementation of the well-known Apriori algorithm. In addition, we suggest features that need to be added to XQuery to make the implementation of the Apriori algorithm more efficient.
Jacky W. W. Wan, Gillian Dobbie