Abstract. Automated online search is a powerful technique for performance diagnosis. Such a search can change the types of experiments it performs while the program is running, making decisions based on live performance data. Previous research has addressed search speed and the scaling of searches to large codes and many nodes. This paper explores locating bottlenecks at a finer granularity in an automated online search, i.e., refining the search to bottlenecks localized to loops. Because instrumentation can be inserted and removed on the fly, an online search can exploit fine-grained program structure in ways that are infeasible with other performance diagnosis techniques. We automatically detect loops in a program’s binary control flow graph and use this information to instrument loops efficiently. We implemented our new strategy in an existing automated online performance tool, Paradyn. Results for several sequential and parallel applications show that a loop-aware...
Eli D. Collins, Barton P. Miller