On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve locality, and they increasingly also modify an application's data layout to reduce cache miss and page fault penalties. In this paper we discuss Global Variable Layout (GVL), an optimization that chooses where entire static global data objects are placed in the binary. We describe two practical methods for GVL in the HP-UX Integrity optimizing compiler for the Itanium architecture. The first layout strategy relies on profile feedback: the compiler, the linker, and a pre-link tool cooperate to reorder the data. The second strategy uses whole-program analysis to drive data layout decisions and does not require a dynamic profile. We give a detailed description of our implementation and evaluate its performance for the SPEC integer benchmark programs, as well as for a large commercial ...