Sciweavers

DEXA
2010
Springer
An Efficient Similarity Join Algorithm with Cosine Similarity Predicate
Given a large collection of objects, finding all pairs of similar objects, namely similarity join, is widely used to solve various problems in many application domains. Computation ...
Dongjoo Lee, Jaehui Park, Junho Shim, Sang-goo Lee
ADBIS
2009
Springer
Efficient Set Similarity Joins Using Min-prefixes
Identification of all objects in a dataset whose similarity is not less than a specified threshold is of major importance for management, search, and analysis of data. Set similari...
Leonardo Ribeiro, Theo Härder
ICDM
2002
IEEE
High Performance Data Mining Using the Nearest Neighbor Join
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the r...
Christian Böhm, Florian Krebs
DEXA
2003
Springer
Supporting KDD Applications by the k-Nearest Neighbor Join
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such t...
Christian Böhm, Florian Krebs
DASFAA
2006
IEEE
Probabilistic Similarity Join on Uncertain Data
An important database primitive for commonly used feature databases is the similarity join. It combines two datasets based on some similarity predicate into one set such that the n...
Hans-Peter Kriegel, Peter Kunath, Martin Pfeifle, ...
ICDE
2008
IEEE
Compact Similarity Joins
Similarity joins have attracted significant interest, with applications in Geographical Information Systems, astronomy, marketing analyses, and anomaly detection. However, all...
Brent Bryan, Frederick Eberhardt, Christos Falouts...
SIGMOD
2001
ACM
Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
Christian Böhm, Bernhard Braunmüller, Fl...
ICDE
1997
IEEE
High-Dimensional Similarity Joins
Many emerging data mining applications require a similarity join between points in a high-dimensional domain. We present a new algorithm that utilizes a new index structure, calle...
Kyuseok Shim, Ramakrishnan Srikant, Rakesh Agrawal
ICDE
2009
IEEE
Top-k Set Similarity Joins
Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...
Chuan Xiao, Wei Wang, Xuemin Lin, Haichuan Sh...