A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which is implemented in most compilers, is to use analytical models to determine these values. The other approach, used in library generators like ATLAS, is to perform a global empirical search over the space of parameter values. Neither approach is completely suitable for use in general-purpose compilers that must generate high-quality code for large programs running on complex architectures. Model-driven optimization may incur a performance penalty of 10-20% even for a relatively simple code like matrix multiplication. On the other hand, global search is not tractable for optimizing large programs for complex architectures because the optimization space is too large. In this paper, we advocate a methodology for generating high-performance code without increasing search time dramatically. Our methodology has three c...