SC
2015
ACM

PL2AP: fast parallel cosine similarity search

Solving the AllPairs similarity search problem entails finding all pairs of vectors in a high-dimensional sparse dataset that have a similarity value higher than a given threshold. The output of this problem is a crucial component in many real-world applications, such as clustering, online advertising, recommender systems, near-duplicate document detection, and query refinement. A number of serial algorithms have been proposed that solve the problem by pruning many of the possible similarity candidates for each query object after accessing only a few of their non-zero values. The pruning process results in unpredictable memory access patterns that can reduce search efficiency. In this context, we introduce pL2AP, which efficiently solves the AllPairs cosine similarity search problem in a multi-core environment. Our method uses a number of cache-tiling optimizations, combined with fine-grained
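To make the problem statement concrete, the following is a minimal serial sketch of AllPairs cosine similarity search using an inverted index. It is not pL2AP: it performs no candidate pruning, cache tiling, or parallelism, and serves only as a naive baseline. The names `normalize` and `all_pairs`, the dict-based sparse vector representation, and the choice to treat a similarity equal to the threshold as a match are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict: feature -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {f: w / norm for f, w in vec.items()} if norm > 0 else {}


def all_pairs(vectors, threshold):
    """Return [(i, j, sim)] for each pair i > j with cosine similarity >= threshold.

    Naive baseline (assumption, not the paper's method): every vector is
    compared against all earlier vectors that share at least one feature.
    """
    index = defaultdict(list)  # feature -> [(vector id, weight)] for earlier vectors
    results = []
    for i, raw in enumerate(vectors):
        vec = normalize(raw)
        # Accumulate dot products with earlier vectors, one shared feature at a time.
        scores = defaultdict(float)
        for f, w in vec.items():
            for j, wj in index.get(f, ()):
                scores[j] += w * wj
        # With unit-length vectors, the dot product is the cosine similarity.
        for j, s in sorted(scores.items()):
            if s >= threshold:
                results.append((i, j, s))
        # Index the current vector so later vectors can find it.
        for f, w in vec.items():
            index[f].append((i, w))
    return results


docs = [{"a": 1.0, "b": 1.0}, {"a": 1.0, "b": 1.0}, {"c": 1.0}]
print([(i, j) for i, j, _ in all_pairs(docs, 0.9)])  # prints [(1, 0)]
```

Pruning-based serial algorithms, as described above, would skip most of the accumulation work in the inner loop for candidates that cannot reach the threshold; this sketch scores every candidate that shares a feature.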
David C. Anastasiu, George Karypis
Added 17 Apr 2016
Updated 17 Apr 2016
Type Journal
Year 2015
Where SC
Authors David C. Anastasiu, George Karypis