Fast retrieval methods are critical for large-scale and
data-driven vision applications. Recent work has explored
ways to embed high-dimensional features or complex distance
functions into a low-dimensional Hamming space
where items can be efficiently searched. However, existing
methods do not apply to high-dimensional kernelized
data when the underlying feature embedding for the kernel
is unknown. We show how to generalize locality-sensitive
hashing to accommodate arbitrary kernel functions, making
it possible to preserve the algorithm’s sub-linear time similarity
search guarantees for a wide class of useful similarity
functions. Since a number of successful image-based kernels
have unknown or incomputable embeddings, this is especially
valuable for image retrieval tasks. We validate our
technique on several large-scale datasets, and show that it
enables accurate and fast performance for example-based
object classification, feature matching, and content-based
retri...