The performance of a dynamic optimization system depends heavily on the code it selects to optimize. Many current systems follow the design of HP Dynamo and select a single interprocedural path, or trace, as the unit of code optimization and code caching. Though this approach to region selection has worked well in practice, we show that it is possible to adapt this basic approach to produce regions with greater locality, less needless code duplication, and fewer profiling counters. In particular, we propose two new region-selection algorithms and evaluate them against Dynamo’s selection mechanism, Next-Executing Tail (NET). Our first algorithm, Last-Executed Iteration (LEI), identifies cyclic paths of execution better than NET, improving locality of execution while reducing the size of the code cache. Our second algorithm allows overlapping traces of similar execution frequency to be combined into a single large region. This second technique can be applied to both NET and LEI, an...
David Hiniker, Kim M. Hazelwood, Michael D. Smith