Distributed data management is one of the most important problems facing grids. Within the Enabling Grids for E-sciencE (EGEE) project, currently the world’s largest production grid, a sophisticated hierarchy of data management and storage tools has been developed to help Virtual Organisations (VOs) with this task. In this paper we review the technologies employed for storage and data management in EGEE and the associated Worldwide LHC Computing Grid (WLCG). We describe these technologies from low-level networking and site storage through to data transfer and cataloging middleware components. Particular emphasis is placed on the deployment of these services in a large-scale production environment. We also examine the interface between generic and VO-specific data management, taking the ATLAS high energy physics experiment at CERN as an example.
Graeme A. Stewart, David G. Cameron, Greig A. Cowa