Sciweavers

ICDE
2003
IEEE
Joining Massive High-Dimensional Datasets
We consider the problem of joining massive datasets. We propose two techniques for minimizing disk I/O cost of join operations for both spatial and sequence data. Our techniques o...
Tamer Kahveci, Christian A. Lang, Ambuj K. Singh
SIGMOD
2001
ACM
Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
Christian Böhm, Bernhard Braunmüller, Fl...
ICDE
1997
IEEE
High-Dimensional Similarity Joins
Many emerging data mining applications require a similarity join between points in a high-dimensional domain. We present a new algorithm that utilizes a new index structure, calle...
Kyuseok Shim, Ramakrishnan Srikant, Rakesh Agrawal
IDEAL
2004
Springer
Visualisation of Distributions and Clusters Using ViSOMs on Gene Expression Data
Microarray datasets are often too large to visualise due to the high dimensionality. The self-organising map has been found useful to analyse massive complex datasets. It can be us...
Swapna Sarvesvaran, Hujun Yin
ICDM
2002
IEEE
Using Category-Based Adherence to Cluster Market-Basket Data
In this paper, we devise an efficient algorithm for clustering market-basket data. Different from those of the traditional data, the features of market-basket data are known to b...
Ching-Huang Yun, Kun-Ta Chuang, Ming-Syan Chen