Sciweavers

SDM
2003
SIAM
Data Mining
13 years 10 months ago
Hierarchical Document Clustering using Frequent Itemsets
A major challenge in document clustering is the extremely high dimensionality. For example, the vocabulary for a document set can easily be thousands of words. On the other hand, ...
Benjamin C. M. Fung, Ke Wang, Martin Ester
ICDE
2003
IEEE
Database
14 years 10 months ago
Similarity Search in Sets and Categorical Data Using the Signature Tree
Data mining applications analyze large collections of set data and high dimensional categorical data. Search on these data types is not restricted to the classic problems of minin...
Nikos Mamoulis, David W. Cheung, Wang Lian
CORR
1999
Springer
Education
13 years 8 months ago
Analysis of approximate nearest neighbor searching with clustered point sets
Nearest neighbor searching is a fundamental computational problem. A set of n data points is given in real d-dimensional space, and the problem is to preprocess these poi...
Songrit Maneewongvatana, David M. Mount
INFOVIS
2003
IEEE
14 years 1 months ago
Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration of High Dimensional Datasets
Large numbers of dimensions not only cause clutter in multidimensional visualizations, but also make it difficult for users to navigate the data space. Effective dimension manage...
Jing Yang, Wei Peng, Matthew O. Ward, Elke A. Rund...
OSDI
2004
ACM
14 years 8 months ago
MapReduce: Simplified Data Processing on Large Clusters
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to ge...
Jeffrey Dean, Sanjay Ghemawat