Sciweavers

» Distributed Data Clustering
IDEAL
2000
Springer
13 years 11 months ago
Clustering by Similarity in an Auxiliary Space
Abstract. We present a clustering method for continuous data. It defines local clusters into the (primary) data space but derives its similarity measure from the posterior distribu...
Janne Sinkkonen, Samuel Kaski
CLUSTER
2003
IEEE
14 years 20 days ago
Optimized Implementation of Extendible Hashing to Support Large File System Directory
Extendible hashing is a kind of fast indexing technology; it provides a way of storing structural data records so that each of them can be retrieved very quickly. In this paper,...
Rongfeng Tang, Dan Meng, Sining Wu
MCS
2004
Springer
14 years 22 days ago
A Probabilistic Model Using Information Theoretic Measures for Cluster Ensembles
Abstract. This paper presents a probabilistic model for combining cluster ensembles utilizing information theoretic measures. Starting from a co-association matrix which summarizes...
Hanan Ayad, Otman A. Basir, Mohamed Kamel
ICDE
2008
IEEE
14 years 8 months ago
A Clustered Index Approach to Distributed XPath Processing
Supporting top-k queries over distributed collections of schemaless XML data poses two challenges. While XML supports expressive query languages such as XPath and XQuery, these la...
Georgia Koloniari, Evaggelia Pitoura
SOSP
2009
ACM
14 years 4 months ago
Quincy: fair scheduling for distributed computing clusters
This paper addresses the problem of scheduling concurrent jobs on clusters where application data is stored on the computing nodes. This setting, in which scheduling computations ...
Michael Isard, Vijayan Prabhakaran, Jon Currey, Ud...