Management of large-scale parallel and distributed applications is an extremely complex task due to factors such as centralized management architectures, lack of coordination and ...
In a large distributed system, it is often infeasible or even impossible to perform diagnosis using a single model of the whole system. Instead, several spatially distributed local...
We present a scalable temporal order analysis technique that supports debugging of large-scale applications by classifying MPI tasks based on their logical program execution order...
Dong H. Ahn, Bronis R. de Supinski, Ignacio Laguna...
We introduce the concept of “residual investigation” for program analysis. A residual investigation is a dynamic check installed as a result of running a static analysis that ...
Kaituo Li, Christoph Reichenbach, Christoph Csalln...
Background: An in-silico experiment can be naturally specified as a workflow of activities implementing, in a standardized environment, the process of data and control analysis. A...