Although it is increasingly difficult for large scientific programs to attain a significant fraction of peak performance on systems based on microprocessors with substantial instruction-level parallelism and deep memory hierarchies, performance analysis and tuning tools are still not used on a day-to-day basis by algorithm and application designers. We present HPCView, a toolkit that combines multiple sets of program profile data, correlates the data with source code, and generates a database that can be analyzed portably and collaboratively with commodity Web browsers. We argue that HPCView addresses many of the issues that have limited the usability and utility of most existing tools. We originally built HPCView to facilitate our own work on data layout and optimizing compilers. Now, in addition to daily use within our group, HPCView is being used by several code development teams in DoD and DoE laboratories as well as at NCSA.
John M. Mellor-Crummey, Robert J. Fowler, Gabriel