Many contemporary database applications require similarity-based retrieval of complex objects, where the only usable knowledge of the objects' domain is a metric distance function. In support of these applications, we explored a search strategy for k-nearest neighbor searches with MVP-trees that greedily identifies k answers and then improves the answer set monotonically. The algorithm returns an approximate solution when terminated early, as determined by a limiting radius or an internal measure of progress. Given unbounded time, the algorithm terminates with an exact solution. Approximate solutions to k-nearest neighbor search provide much-needed speed improvements for hard nearest-neighbor problems. Our anytime approximate formulation is well suited to interactive search applications as well as applications where the distance function itself is an approximation. We evaluate the algorithm over a suite of workloads, including image retrieval, biological data and high-dimensional vec...
Weijia Xu, Daniel P. Miranker, Rui Mao, Smriti R.
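
The anytime behavior described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain single-pivot vantage-point tree in place of an MVP-tree, and it uses a hypothetical budget on distance computations (`max_dist_calls`) as a stand-in for the paper's limiting radius or internal measure of progress. The names `build_vp_tree`, `anytime_knn` and `euclid` are illustrative only.

```python
import heapq
import math


def euclid(a, b):
    """Example metric; any function obeying the triangle inequality works."""
    return math.dist(a, b)


def build_vp_tree(points, dist):
    """Build a single-pivot vantage-point tree (a simplification of an MVP-tree)."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return {"vp": vp, "mu": 0.0, "inner": None, "outer": None}
    dists = [dist(vp, p) for p in rest]
    mu = sorted(dists)[len(dists) // 2]  # median distance used as split radius
    inner = [p for p, d in zip(rest, dists) if d <= mu]
    outer = [p for p, d in zip(rest, dists) if d > mu]
    return {"vp": vp, "mu": mu,
            "inner": build_vp_tree(inner, dist),
            "outer": build_vp_tree(outer, dist)}


def anytime_knn(root, query, k, dist, max_dist_calls=None):
    """Best-first k-NN search that can stop early.

    Returns (answers, exact): answers is a sorted list of (distance, point);
    exact is True when the search ran to completion. max_dist_calls is an
    assumed stopping rule, not the criterion used in the paper.
    """
    best = []      # max-heap of current answers, stored as (-distance, id, point)
    frontier = []  # min-heap of unexplored subtrees, (lower_bound, id, node)
    calls = 0
    counter = 0    # unique tiebreaker so heaps never compare points or nodes
    if root is not None:
        frontier.append((0.0, counter, root))
    exact = True
    while frontier:
        bound, _, node = heapq.heappop(frontier)
        if len(best) == k and bound >= -best[0][0]:
            break  # no remaining subtree can improve the answer set
        if max_dist_calls is not None and calls >= max_dist_calls:
            exact = False  # budget exhausted: return the approximate answer
            break
        d = dist(query, node["vp"])
        calls += 1
        counter += 1
        if len(best) < k:
            heapq.heappush(best, (-d, counter, node["vp"]))
        elif d < -best[0][0]:
            # Replacing the worst answer never increases the k-th distance.
            heapq.heapreplace(best, (-d, counter, node["vp"]))
        # The triangle inequality gives a lower bound for each child subtree.
        children = ((node["inner"], max(0.0, d - node["mu"])),
                    (node["outer"], max(0.0, node["mu"] - d)))
        for child, lower in children:
            if child is not None:
                counter += 1
                heapq.heappush(frontier, (max(lower, bound), counter, child))
    return sorted((-neg_d, p) for neg_d, _, p in best), exact
```

Calling `anytime_knn(tree, query, k, euclid)` with no budget runs until the pruning test proves the answer set exact, while a small `max_dist_calls` returns the best answers found so far. Because an answer is only ever replaced by a closer one, the distance to the k-th answer never grows as the budget is raised.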