Most RDF query languages support graph structure search through a conjunction of triples, which is typically processed using join operations. A key factor in optimizing joins is determining the join order, which depends on the expected cardinality of intermediate results. This work proposes a pattern-based summarization framework for estimating the cardinality of RDF graph patterns. We present experiments on real-world and synthetic datasets that confirm the feasibility of our approach.

Categories and Subject Descriptors: H.2.4 [Systems]: Query processing

General Terms: Management, Performance
Angela Maduko, Kemafor Anyanwu, Amit P. Sheth, Pau