Sciweavers

JPDC
2011

BlobSeer: Next-generation data management for large scale infrastructures

As data volumes grow rapidly across more and more application fields in science, engineering, information services, etc., the challenges posed by data-intensive computing are gaining importance. The emergence of highly scalable infrastructures, e.g. for cloud computing and for petascale computing and beyond, introduces additional issues for which scalable data management becomes an immediate need. This paper makes several contributions. First, it proposes a set of principles for designing highly scalable distributed storage systems that are optimized for heavy data access concurrency. In particular, we highlight the potentially large benefits of using versioning in this context. Second, based on these principles, we propose a set of versioning algorithms, both for data and metadata, that enable high throughput under concurrency. Finally, we implement and evaluate these algorithms in the BlobSeer prototype, which we integrate as a storage backend in the Hadoop MapRedu...
Added 14 May 2011
Updated 14 May 2011
Type Journal
Year 2011
Where JPDC
Authors Bogdan Nicolae, Gabriel Antoniu, Luc Bougé, Diana Moise, Alexandra Carpen-Amarie