Over the last decade, a dramatic increase has been observed in the need for generating and organising data in the course of large parameter studies, performance analysis, and soft...
Radu Prodan, Thomas Fahringer, Michael Geissler, G...
Unlike sequential programs, parallel programs have characteristics of their own that are difficult to analyze in a multi-process or multi-threaded environment. This paper...
Xu Liu, Lin Yuan, Jianfeng Zhan, Bibo Tu, Dan Meng
We developed an automated environment to measure the memory access behavior of applications on high-performance clusters. Code optimization for processor caches is cruc...
Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune ...
Todd Gamblin, Bronis R. de Supinski, Martin Schulz...
Distributed applications, especially I/O-intensive ones, often access the storage subsystem in a non-sequential way (stride requests). Since such behaviors lower the ove...