Accurate performance predictions are difficult to achieve for parallel applications executing on production distributed systems. Conventional point-valued performance parameters and prediction models are often inaccurate since they can only represent one point in a range of possible behaviors. We address this problem by allowing characteristic application and system data to be represented by a set of possible values and their probabilities, which we call stochastic values. In this paper, we give a practical methodology for using stochastic values as parameters to adaptable performance prediction models. We demonstrate their usefulness for a distributed SOR application, showing stochastic values to be more effective than single (point) values in predicting the range of application behavior that can occur during execution in production environments.
Jennifer M. Schopf, Francine Berman
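
As an illustration of the idea of stochastic values described in the abstract, the following minimal sketch feeds discrete stochastic parameters into a toy prediction model and compares the result with a single point-valued prediction. Everything here is an assumption made for illustration and is not taken from the paper: the representation of a stochastic value as a list of (value, probability) pairs, the independence of the two parameters, the names `cpu_availability`, `bandwidth_mbps`, `predict_time`, and `stochastic_prediction`, the form of the model, and all numbers.

```python
from itertools import product

# Assumed representation: a stochastic value is a list of (value, probability) pairs.
# All numbers below are hypothetical, chosen only to illustrate the idea.
cpu_availability = [(0.25, 0.2), (0.5, 0.5), (1.0, 0.3)]  # fraction of CPU available
bandwidth_mbps = [(4.0, 0.5), (8.0, 0.5)]                 # available network bandwidth

DEDICATED_COMPUTE_S = 10.0  # hypothetical compute time on an unloaded machine
MESSAGE_MBIT = 16.0         # hypothetical total data exchanged


def predict_time(cpu, bandwidth):
    """Toy model (an assumption): compute time scaled by load, plus transfer time."""
    return DEDICATED_COMPUTE_S / cpu + MESSAGE_MBIT / bandwidth


def stochastic_prediction(cpu_dist, bw_dist):
    """Propagate stochastic parameters through the model.

    Enumerates every combination of parameter values and, assuming the
    parameters are independent, multiplies their probabilities.
    Returns (predicted_time, probability) pairs sorted by time.
    """
    outcomes = {}
    for (cpu, p_cpu), (bw, p_bw) in product(cpu_dist, bw_dist):
        t = predict_time(cpu, bw)
        outcomes[t] = outcomes.get(t, 0.0) + p_cpu * p_bw
    return sorted(outcomes.items())


def mean(dist):
    """Expected value of a discrete stochastic value."""
    return sum(value * prob for value, prob in dist)


# Point-valued prediction: collapse each parameter to its mean first.
point = predict_time(mean(cpu_availability), mean(bandwidth_mbps))

# Stochastic prediction: a set of possible execution times with probabilities.
stochastic = stochastic_prediction(cpu_availability, bandwidth_mbps)

print(f"point prediction: {point:.2f} s")
for t, p in stochastic:
    print(f"  {t:5.1f} s with probability {p:.2f}")
```

In this sketch the point prediction is a single number, while the stochastic prediction exposes the whole spread of execution times that the assumed parameter variability could produce.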