A large number of database index structures have been proposed over the last two decades, and little consensus has emerged regarding their relative effectiveness. In order to empirically evaluate these indexes, it is helpful to have methodologies for generating random queries for performance testing. In this paper we propose a natural, domain-independent approach to the generation of random queries for experimenting with indexes: choose randomly among all logically distinct queries. We investigate this idea in the context of a widely-used and widely-studied indexing workload: range queries over 2-dimensional points. We present an algorithm that chooses randomly among logically distinct 2-d range queries. It has constant-time expected performance over uniformly distributed data, and exhibited good performance in experiments over a variety of real and synthetic data sets. We observe nonuniformities in the way randomly chosen logical 2-d range queries are distributed over a variety of spat...
Joseph M. Hellerstein, Lisa Hellerstein, George Ko
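
The idea of choosing randomly among logically distinct queries can be illustrated with a small brute-force sketch. It is not the algorithm presented in the paper, which is described as having constant-time expected performance over uniformly distributed data; this version enumerates every candidate rectangle and is only practical for tiny inputs. It assumes that two range queries are logically distinct exactly when they return different sets of points, and it omits the empty result. The function names `distinct_range_queries` and `random_logical_query` are hypothetical.

```python
import random
from itertools import combinations_with_replacement


def distinct_range_queries(points):
    """Enumerate the distinct non-empty result sets of axis-aligned
    range queries over a list of 2-d points (brute force).

    Assumption: queries are "logically distinct" when their result
    sets differ. Any non-empty result set is also produced by the
    bounding box of its own points, so it suffices to try rectangles
    whose sides lie on coordinates that occur in the data.
    """
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    results = set()
    for x_lo, x_hi in combinations_with_replacement(xs, 2):
        for y_lo, y_hi in combinations_with_replacement(ys, 2):
            hit = frozenset(
                p for p in points
                if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi
            )
            if hit:
                results.add(hit)
    # Sort so that the enumeration order is deterministic.
    return sorted(results, key=sorted)


def random_logical_query(points, rng=random):
    """Pick one logically distinct query uniformly at random,
    represented by its result set."""
    return rng.choice(distinct_range_queries(points))


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (2, 0)]
    print(len(distinct_range_queries(pts)))
    print(sorted(random_logical_query(pts, random.Random(0))))
```

Sampling over result sets rather than over rectangle coordinates is the point of the sketch: many geometrically different rectangles return the same points, and under this scheme they count as a single query.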