We consider the problem of indexing general database workloads (combinations of data sets and sets of potential queries). We define a framework for measuring the efficiency of an indexing scheme for a workload based on two characterizations: storage redundancy (how many times each item in the data set is stored) and access overhead (how many times more blocks than necessary a query retrieves). Using this framework, we present some initial results, showing upper and lower bounds on these two measures and the trade-offs between them in the case of multi-dimensional range queries and set queries.
Joseph M. Hellerstein, Elias Koutsoupias, Christos