This paper proposes a clustering approach that explores both the content and the structure of XML documents for determining similarity among them. Assuming that the content and th...
Combination of structure and content features is necessary for effective retrieval and classification of XML documents. Composite kernels provide a way for fusion of content and s...
A natural consequence of the widespread adoption of XML as standard for information representation and exchange is the redundant storage of large amounts of persistent XML documen...
With the advent of XML as the de facto language for data publishing and exchange, scalable distribution of XML data to large, dynamic populations of consumers remains an important...
In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure. Experimenta...