Applications like multimedia retrieval require efficient support for similarity search on large data collections. Yet, nearest neighbor search is a difficult problem in high dimensional spaces, rendering efficient applications hard to realize: index structures degrade rapidly with increasing dimensionality, while sequential search is not an attractive solution for repositories with millions of objects. This paper approaches the problem from a different angle. A solution is sought in an unconventional storage scheme, that opens up a new range of techniques for processing k-NN queries, especially suited for high dimensional spaces. The suggested (physical) database design accommodates well a novel variant of branch-and-bound search, that reduces the high dimensional space quickly to a small candidate set. The paper provides insight in applying this idea to k-NN search using two similarity metrics commonly encountered in image database applications, and discusses techniques for its imple...
Arjen P. de Vries, Nikos Mamoulis, Niels Nes, Mart