Using historical information to predict future runs of parallel jobs has shown to be valuable in job scheduling. Trends toward more flexible jobscheduling techniques such as adaptive resource allocation, and toward the expansion of scheduling to grids, make runtime predictions even more important. We present a technique of employing both a user’s knowledge of his/her parallel application and historical application-run data, synthesizing them to derive accurate and scalable predictions for future runs. These scalable predictions apply to runtime characteristics for different numbers of nodes (processor scalability) and different problem sizes (problem-size scalability). We employ multiple linear regression and show that for decently accurate complexity models, good prediction accuracy can be obtained.
Benjamin J. Lafreniere, Angela C. Sodan