Current relational databases require that a database schema exist prior to data entry and require manual optimization for best performance. We describe the query optimization techniques used by graphd, the schema-last, automatically indexed tuple-store which supports freebase.com, a large world-writable database. Graphd is a log-structured store with a query optimizer based on a functional operator tree over the domain of sorted integer sets which accumulate naturally as tuples are appended to the store. We demonstrate that a set-based optimizer can deliver performance that is roughly comparable to traditional RDBMS query optimization techniques applied to a fixed schema.

Categories and Subject Descriptors
H.3.1 [Content Analysis and Indexing]: Indexing Methods; H.2.4 [Systems]: Query Processing

General Terms
Performance, Algorithms

Keywords
Query Optimization, Tuple-Store, Triple-Store, Graph Store, Schema-Last, Database, Object-Oriented
Scott M. Meyer, Jutta Degener, John Giannandrea, B