All existing proposals for querying XML (e.g., XQuery) rely on a pattern-specification language that allows (1) path navigation and branching through the label structure of the XML data graph, and (2) predicates on the values of specific path/branch nodes, in order to reach the desired data elements. Optimizing such queries depends crucially on the existence of concise synopsis structures that enable accurate compiletime selectivity estimates for complex path expressions over graph-structured XML data. In this paper, we extend our earlier work on structural XSKETCH synopses and we propose an (augmented) XSKETCH synopsis model that exploits localized stability and valuedistribution summaries (e.g., histograms) to accurately capture the complex correlation patterns that can exist between and across path structure and element values in the data graph. We develop a systematic XSKETCH estimation framework for complex path expressions with value predicates and we propose an efficient heuris...
Neoklis Polyzotis, Minos N. Garofalakis