High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) gr...
Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper we present a scalable sampling imple...
High-dimensional indexing is one of the most challenging tasks for content-based video retrieval (CBVR). Typically, in a video database, there exist two kinds of clues for a query: visual...
Zhiping Shi, Qingyong Li, Zhiwei Shi, Zhongzhi Shi
Approximate Nearest Neighbor (ANN) methods such as Locality Sensitive Hashing, Semantic Hashing, and Spectral Hashing provide computationally efficient procedures for finding objects...
We simulate different architectures of a distributed Information Retrieval system on a very large Web collection, in order to work out the optimal setting for a particular set of r...