The present paper analyzes the usefulness of the normalized compression distance for the problem to cluster the hemagglutinin (HA) sequences of influenza virus data for the HA gene...
The Dryad and DryadLINQ systems offer a new programming model for large scale data-parallel computing. They generalize previous execution environments such as SQL and MapReduce in...
The use of a cluster for distributed performance analysis of parallel trace data is discussed. We propose an analysis architecture that uses multiple cluster nodes as a server to ...
Abstract. We describe a clustering approach with the emphasis on detecting coherent structures in a complex dataset, and illustrate its effectiveness with computer vision applicat...
Given the pairwise affinity relations associated with a set of data items, the goal of a clustering algorithm is to automatically partition the data into a small number of homogen...