Sciweavers

SDM
2008
SIAM

The Asymmetric Approximate Anytime Join: A New Primitive with Applications to Data Mining

It has long been noted that many data mining algorithms can be built on top of join algorithms. This has led to a wealth of recent work on efficiently supporting such joins with various indexing techniques. However, many applications are characterized by two special conditions. First, the two datasets to be joined are of radically different sizes, a situation we call an asymmetric join. Second, the two datasets are not, and possibly cannot be, indexed for some reason. In such circumstances the time complexity is proportional to the product of the number of objects in each of the two datasets, an untenable proposition in most cases. In this work we make two contributions to mitigate this situation. We argue that for many applications, an exact solution to the problem is not required, and we show that by framing the problem as an anytime algorithm we can extract most of the benefit of a join in a small fraction of the time taken by the full algorithm. In situations w...
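To make the idea of an anytime join concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a nearest-neighbor join over points with Euclidean distance, and the function name anytime_nn_join and the max_comparisons budget are hypothetical names introduced here for illustration. The loop can be stopped after any number of comparisons and still returns the best match found so far for each object in the small dataset. Run to completion, it performs the full quadratic join described in the abstract.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def anytime_nn_join(
    small: List[Point],
    large: List[Point],
    max_comparisons: Optional[int] = None,
) -> List[Tuple[float, Optional[int]]]:
    """Brute-force nearest-neighbor join that can be interrupted early.

    Hypothetical illustration: for each point in `small`, track the closest
    point seen so far in `large` as a (distance, index_in_large) pair.
    `max_comparisons` stands in for an interruption; None means run to the end.
    """
    best: List[Tuple[float, Optional[int]]] = [(math.inf, None) for _ in small]
    done = 0
    # Scan the large dataset once; each element is compared to every small one.
    for j, b in enumerate(large):
        for i, a in enumerate(small):
            if max_comparisons is not None and done >= max_comparisons:
                return best  # interrupted: return the approximate answer
            d = math.dist(a, b)
            done += 1
            if d < best[i][0]:
                best[i] = (d, j)
    return best  # exact answer after len(small) * len(large) comparisons


if __name__ == "__main__":
    small = [(0.0, 0.0), (5.0, 5.0)]
    large = [(1.0, 0.0), (5.0, 4.0), (0.0, 0.5), (9.0, 9.0)]
    print(anytime_nn_join(small, large, max_comparisons=2))  # partial result
    print(anytime_nn_join(small, large))                     # full join
```

In this sketch the result can only improve as more comparisons are allowed, which is the property that makes an interruptible join useful. The order in which the large dataset is scanned is left naive here.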
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SDM
Authors Lexiang Ye, Xiaoyue Wang, Dragomir Yankov, Eamonn J. Keogh