The interest of a geographically distributed user base in mining massive collections of scientific data drives the need for efficient data dissemination solutions. An optimal data distribution scheme will find the delicate and often application-specific balance among conflicting success metrics such as minimizing transfer times, minimizing the impact on the network, and distributing load uniformly among participants. We use simulations to explore the performance of the main classes of data distribution techniques, some of them successfully deployed by large peer-to-peer communities, in the context of today’s data-centric scientific collaborations. Based on these simulations, we derive several recommendations for data distribution in real-world scientific collaborations.