One way to find closest pairs in large datasets is to use hash functions [6], [12]. In recent years, locality-sensitive hash functions have been given for various metrics; projecting an n-cube onto k bits is a simple hash function that performs well. In this paper we investigate alternatives to projection. For various parameters, hash functions given by complete decoding algorithms for codes work better, and asymptotically random codes perform better than projection.
Daniel M. Gordon, Victor Miller, Peter Ostapenko
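
To make the two kinds of hash function in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes n = 7 and uses the [7,4] Hamming code as a stand-in example of a code with a simple complete decoder; the names `projection_hash`, `hamming74_decode_hash`, and `collision_rate` are hypothetical and chosen only for illustration. Both hashes split the 7-cube into 16 buckets, so their collision rates at a given Hamming distance can be compared directly.

```python
from fractions import Fraction
from itertools import combinations, product


def projection_hash(x, coords):
    """Project the bit vector x (a tuple of 0/1) onto the chosen coordinates."""
    return tuple(x[i] for i in coords)


def hamming74_decode_hash(x):
    """Hash a 7-bit word to its nearest [7,4] Hamming codeword.

    Assumption for this sketch: the parity-check matrix has the binary
    expansion of j as its j-th column (j = 1..7), so the syndrome is the
    XOR of the 1-indexed positions of the set bits. The code is perfect,
    so every word lies within distance 1 of exactly one codeword and
    flipping the bit named by the syndrome is a complete decoder.
    """
    syndrome = 0
    for pos, bit in enumerate(x, start=1):
        if bit:
            syndrome ^= pos
    y = list(x)
    if syndrome:
        y[syndrome - 1] ^= 1  # correct the single flipped position
    return tuple(y)


def collision_rate(hash_fn, n, d):
    """Exact fraction of pairs at Hamming distance d that hash_fn sends
    to the same bucket, found by enumerating the whole n-cube."""
    hits = total = 0
    for x in product((0, 1), repeat=n):
        for flips in combinations(range(n), d):
            y = list(x)
            for i in flips:
                y[i] ^= 1
            total += 1
            hits += hash_fn(x) == hash_fn(tuple(y))
    return Fraction(hits, total)


def project4(x):
    """Projection of a 7-bit word onto its first 4 coordinates (16 buckets)."""
    return projection_hash(x, (0, 1, 2, 3))


if __name__ == "__main__":
    for d in range(1, 4):
        print(d, collision_rate(project4, 7, d),
              collision_rate(hamming74_decode_hash, 7, d))
```

Exhaustive enumeration is only feasible for very small n; it is used here so that the comparison is exact rather than sampled. Which hash does better depends on the distance d and the bucket count, in line with the abstract's "for various parameters".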