Sciweavers

125 search results - page 13 / 25
» Finding Skew Partitions Efficiently
CIDM
2007
IEEE
14 years 1 month ago
Scalable Clustering for Large High-Dimensional Data Based on Data Summarization
Clustering large data sets with high dimensionality is a challenging data-mining task. This paper presents a framework to perform such a task efficiently. It is based on the notio...
Ying Lai, Ratko Orlandic, Wai Gen Yee, Sachin Kulk...
PAMI
2006
134 views
13 years 7 months ago
A Genetic Algorithm Using Hyper-Quadtrees for Low-Dimensional K-means Clustering
The k-means algorithm is widely used for clustering because of its computational efficiency. Given n points in d-dimensional space and the number of desired clusters k, k-means see...
Michael Laszlo, Sumitra Mukherjee
DKE
2008
79 views
13 years 7 months ago
Extracting k most important groups from data efficiently
We study an important data analysis operator, which extracts the k most important groups from data (i.e., the k groups with the highest aggregate values). In a data warehousing co...
Man Lung Yiu, Nikos Mamoulis, Vagelis Hristidis
EDBT
2008
ACM
169 views, Database
14 years 7 months ago
Efficient online top-K retrieval with arbitrary similarity measures
The top-k retrieval problem requires finding k objects most similar to a given query object. Similarities between objects are most often computed as aggregated similarities of the...
Prasad M. Deshpande, Deepak P, Krishna Kummamuru
SIGMOD
2009
ACM
269 views, Database
14 years 7 months ago
Efficient approximate entity extraction with edit distance constraints
Named entity recognition aims at extracting named entities from unstructured text. A recent trend in named entity recognition is finding approximate matches in the text with respe...
Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zha...