To improve data availability and resilience, MapReduce frameworks use file systems that replicate data uniformly. However, analysis of job logs from a large production cluster show...
Abstract. Cloud computing is an emerging paradigm to provide Infrastructure as a Service (IaaS). In this paper we present NEPTUNE-IaaS, a software system able to support the whole ...
Vittorio Manetti, Pasquale Di Gennaro, Roberto Bif...
Normal mixture models are widely used for statistical modeling of data, including cluster analysis. However, maximum likelihood estimation (MLE) for normal mixtures using the EM al...
This paper introduces three new contributions to the problems of image classification and image search. First, we propose a new image patch quantization algorithm. Other competitiv...
In this paper we present the design of a modern course in cluster computing and large-scale data processing. The defining differences between this and previously published designs...
Aaron Kimball, Sierra Michels-Slettvet, Christophe...