This paper studies the problem of maintaining external dynamic dictionaries with variable-length keys. We introduce a new type of balanced tree, called the S(b)-tree, which generalizes the traditional B-tree. In contrast to B-trees, S(b)-trees provide optimal utilization for keys of variable length, while the data access time remains logarithmic, the same as for B-trees. The main property of the new trees is their local incompressibility: no sequence of b + 1 neighboring nodes of the tree can be compressed into b well-formed nodes. We prove a 1 − ε utilization lower bound for these trees, where ε is inversely proportional to the tree branching. Logarithmic-time algorithms for search, insertion, and deletion are presented. The data structure is a flexible storage solution for semi-structured data and XML databases.
Konstantin V. Shvachko
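The local incompressibility property stated in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes, purely for illustration, that a node is well formed whenever the total length of its keys does not exceed a fixed capacity, and that keys must keep their order when repacked. The names `min_nodes_needed`, `is_locally_incompressible`, and `capacity` are hypothetical and do not come from the paper.

```python
from typing import List


def min_nodes_needed(key_lengths: List[int], capacity: int) -> int:
    """Fewest nodes that can hold the keys in order.

    Assumption (not from the paper): a node is well formed when the
    sum of its key lengths is at most `capacity`. Under that rule,
    greedy in-order packing uses the minimum number of nodes.
    """
    nodes, used = 0, 0
    for length in key_lengths:
        if length > capacity:
            raise ValueError("key longer than node capacity")
        if nodes == 0 or used + length > capacity:
            # Open a new node for this key.
            nodes += 1
            used = length
        else:
            used += length
    return nodes


def is_locally_incompressible(nodes: List[List[int]], b: int, capacity: int) -> bool:
    """True if no run of b + 1 neighboring nodes fits into b nodes.

    Each node is given as the list of its key lengths.
    """
    for start in range(len(nodes) - b):
        window = nodes[start:start + b + 1]
        lengths = [length for node in window for length in node]
        if min_nodes_needed(lengths, capacity) <= b:
            return False
    return True
```

For example, under these assumptions with `b = 2` and `capacity = 10`, the level `[[6, 3], [6], [5, 4]]` is locally incompressible, whereas `[[5], [5], [5]]` is not, since its three nodes repack into two.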