Abstract. Nearest neighbor searching is a fundamental computational problem. A set of n data points is given in real d-dimensional space, and the problem is to preprocess these points into a data structure, so that given a query point, the nearest data point to the query point can be reported efficiently. Because data sets can be quite large, we are primarily interested in data structures that use only O(dn) storage. A popular class of data structures for nearest neighbor searching is the kd-tree and variants based on hierarchically decomposing space into rectangular cells. An important question in the construction of such data structures is the choice of a splitting method, which determines the dimension and splitting plane to be used at each stage of the decomposition. This choice of splitting method can have a significant influence on the efficiency of the data structure. This is especially true when data and query points are clustered in low-dimensional subspaces. This is because c...
Songrit Maneewongvatana, David M. Mount
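
To make the role of a splitting method concrete, here is a minimal sketch of a kd-tree in Python. It assumes a common textbook rule (split on the dimension of greatest spread, at the median point), which is not necessarily any of the splitting methods studied in the paper. The names KDNode, choose_split_dim, build, and nearest are hypothetical and chosen only for illustration. One node is stored per data point, so storage is O(dn).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class KDNode:
    point: Point                      # data point stored at this node
    dim: int                          # splitting dimension
    left: Optional["KDNode"] = None   # points with coordinate <= point[dim]
    right: Optional["KDNode"] = None  # points with coordinate >= point[dim]


def choose_split_dim(points: List[Point]) -> int:
    """Assumed splitting rule: pick the dimension with the largest spread."""
    d = len(points[0])
    return max(range(d),
               key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))


def build(points: List[Point]) -> Optional[KDNode]:
    """Recursively split the points at the median along the chosen dimension."""
    if not points:
        return None
    dim = choose_split_dim(points)
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    return KDNode(pts[mid], dim, build(pts[:mid]), build(pts[mid + 1:]))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[KDNode], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return a data point closest to query, pruning subtrees that cannot help."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.dim] - node.point[node.dim]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best so far.
    if diff * diff < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))  # (8, 1)
```

Swapping in a different choose_split_dim, or a different split position in build, changes the shape of the cells and therefore how many nodes a query visits, which is the kind of effect the abstract attributes to the choice of splitting method.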