Abstract. As data volumes processed by large-scale distributed dataintensive applications grow at high-speed, an increasing I/O pressure is put on the underlying storage service, which is responsible for data management. One particularly difficult challenge, that the storage service has to deal with, is to sustain a high I/O throughput in spite of heavy access concurrency to massive data. In order to do so, massively parallel data transfers need to be performed, which invariably lead to a high bandwidth utilization. With the emergence of cloud computing, data intensive applications become attractive for a wide public that does not have the resources to maintain expensive large scale distributed infrastructures to run such applications. In this context, minimizing the storage space and bandwidth utilization is highly relevant, as these resources are paid for according to the consumption. This paper evaluates the trade-off resulting from transparently applying data compression to conserv...