In data dominated applications, like multi-media and telecom applications, data storage and transfers are the most important factors in terms of energy consumption, area and system performance. Several steps which optimize these costs are present in our systematic Data Transfer and Storage Exploration methodology. In the important step discussed in this paper, the cycle budget available for background storage transfers is globally distributed over the application's memory accesses that are typically grouped in the loop and function hierarchy. This is crucial for meeting the real-time constraints with a customized memory organisation without counteracting the memory size and energy budget optimizations achieved by earlier steps in our script. This paper proves the effectiveness of the prototype tool on driver applications of several application domains. It clearly shows the tradeoff between power, area and speed.