We introduce a refinement strategy that brings parallel performance analysis closer to the user. The analysis starts with a simple, high-level performance model based on first-order approximations, expressed in terms of the logical constituents of the parallel program and the characteristics of the system. This model is then progressively refined with more detailed, low-level performance aspects to explain divergences from a ‘normal’, linear regime. A causal model structures the relations between all variables involved. The approach is intended to serve as a link between detailed performance data and the developer. We demonstrate it with a parallel matrix multiplication algorithm.
Jan Lemeire, Andy Crijns, John Crijns, Erik F. Dir