The vision of the Semantic Web has brought about new challenges at the intersection of web research and data management. One fundamental research issue at this intersection is the storage of Resource Description Framework (RDF) data, the model at the core of the Semantic Web. We present a data-centric approach for storing RDF in relational databases. The intuition behind our approach is that each RDF dataset requires a tailored table schema that achieves efficient query processing by (1) reducing the need for joins in the query plan and (2) keeping null storage below a given threshold. Using a basic structure derived from the RDF data, we propose a two-phase algorithm involving clustering and partitioning. The clustering phase aims to reduce the need for joins in a query. The partitioning phase aims to optimize storage of extra (i.e., null) data in the underlying relational database. Our approach does not assume a particular query workload, relevant for RDF knowledge bases w...
Justin J. Levandoski, Mohamed F. Mokbel
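
The trade-off named in the abstract, fewer joins versus bounded null storage, can be illustrated with a small sketch. This is not the authors' clustering or partitioning algorithm; it only measures how many null cells would result if a chosen group of properties were stored as columns of one relational table. The toy triples, the function name `null_fraction`, and the 0.3 threshold are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy RDF triples: (subject, property, object).
triples = [
    ("s1", "name", "Alice"),
    ("s1", "email", "alice@example.org"),
    ("s2", "name", "Bob"),
    ("s3", "name", "Carol"),
    ("s3", "email", "carol@example.org"),
    ("s3", "homepage", "http://example.org/carol"),
]


def null_fraction(triples, properties):
    """Fraction of null cells if `properties` became columns of one table.

    The assumed table has one row per subject that carries at least one of
    the given properties; a cell is null when the subject lacks that property.
    """
    props = set(properties)
    seen = defaultdict(set)  # subject -> properties it actually has
    for subject, prop, _obj in triples:
        if prop in props:
            seen[subject].add(prop)
    if not seen:
        return 0.0  # no rows, so no null cells
    total_cells = len(seen) * len(props)
    filled_cells = sum(len(ps) for ps in seen.values())
    return (total_cells - filled_cells) / total_cells


# Hypothetical null threshold; the abstract does not state a value.
NULL_THRESHOLD = 0.3

for group in (["name", "email"], ["name", "email", "homepage"]):
    frac = null_fraction(triples, group)
    verdict = "within" if frac <= NULL_THRESHOLD else "exceeds"
    print(f"{group}: null fraction {frac:.2f} ({verdict} threshold)")
```

Under these assumptions, storing `name` and `email` together avoids a join between them at a modest null cost, while adding the rarely used `homepage` property pushes the table past the assumed threshold, which is the kind of case where splitting the table would be considered.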