Abstract. We present a clustering method for continuous data. It defines local clusters into the (primary) data space but derives its similarity measure from the posterior distribu...
Extendible hashing is a kind of fast indexing technology; it provides with a way of storing structural data records so that each of them can be gotten very quickly. In this paper,...
Abstract. This paper presents a probabilistic model for combining cluster ensembles utilizing information theoretic measures. Starting from a co-association matrix which summarizes...
Supporting top-k queries over distributed collections of schemaless XML data poses two challenges. While XML supports expressive query languages such as XPath and XQuery, these la...
This paper addresses the problem of scheduling concurrent jobs on clusters where application data is stored on the computing nodes. This setting, in which scheduling computations ...
Michael Isard, Vijayan Prabhakaran, Jon Currey, Ud...