Grid computing has reached the stage where deployments are mature and many collaborations run in production mode. Mature Grid deployments offer an opportunity to revisit, and perhaps update, traditional beliefs about workload models, which in turn prompts a reevaluation of traditional resource management techniques. This paper analyzes usage patterns in a typical Grid community: a large-scale, data-intensive scientific collaboration in high-energy physics. We focus mainly on data usage, since data is the major resource for this class of applications. Our observations led us to propose a new abstraction for resource management in scientific data analysis applications: we define a filecule as a group of files that is always used together. We show that filecules exist and present their characteristics. The existence of filecules suggests a new granularity for data management which, if incorporated in design, can significantly outperform the traditional solutions for ...
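To make the filecule definition concrete, the following minimal sketch groups files from a job trace. It assumes that "always used together" can be read as "requested by exactly the same set of jobs"; this reading, the function name `find_filecules`, and the sample trace are illustrative assumptions rather than the procedure used in this study.

```python
from collections import defaultdict


def find_filecules(job_files):
    """Group files that are requested by exactly the same set of jobs.

    job_files: dict mapping a job id to an iterable of file names it read.
    Returns a sorted list of filecules, each a sorted list of file names.
    """
    # Invert the trace: for each file, collect the jobs that used it.
    jobs_by_file = defaultdict(set)
    for job, files in job_files.items():
        for name in files:
            jobs_by_file[name].add(job)

    # Files with identical job sets are never requested separately,
    # so under the stated assumption they form one filecule.
    groups = defaultdict(list)
    for name, jobs in jobs_by_file.items():
        groups[frozenset(jobs)].append(name)

    return sorted(sorted(members) for members in groups.values())


# Hypothetical trace, invented for illustration only.
trace = {
    "job1": ["a.root", "b.root", "c.root"],
    "job2": ["a.root", "b.root"],
    "job3": ["c.root", "d.root"],
}
filecules = find_filecules(trace)
print(filecules)
```

In this toy trace, `a.root` and `b.root` are requested by the same two jobs and therefore form one filecule, while `c.root` and `d.root` each form a single-file filecule. Under this reading, every file belongs to exactly one filecule, so the groups partition the set of accessed files.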