A key step in the optimization of declarative queries over XML data is estimating the selectivity of path expressions, i.e., the number of elements reached by a specific navigation pattern through the XML data graph. Recent studies have introduced XSketch structural graph synopses as an effective, space-efficient tool for the compile-time estimation of complex path-expression selectivities over graph-structured, schema-less XML data. Briefly, XSketches exploit localized graph stability and well-founded statistical assumptions to accurately approximate the path and branching distribution in the underlying XML data graph. Empirical results have demonstrated the effectiveness of XSketch summaries over real-life and synthetic data sets, and for a variety of path-expression workloads. In this paper, we introduce fractional XSketches (fXSketches), a simple yet intuitive and very effective generalization of the basic XSketch summarization mechanism. In a nutshell, our fXSketch synopsis ext...
Natasha Drukh, Neoklis Polyzotis, Minos N. Garofal