—As modern processors are becoming increasingly complex, fast and accurate performance prediction is crucial during the early phases of hardware and software co-development. To accurately and efficiently predict the performance of a given software workload is, however, a challenging problem. Traditional cycle-accurate simulation is often too slow, while analytical models are not sufficiently accurate or still require target-specific execution statistics that may be slow or difficult to obtain. In this paper, we propose a novel learning-based approach for synthesizing analytical models that can accurately predict the performance of a workload on a target platform from various performance statistics obtained directly on a host platform using built-in hardware counters. Our learning approach relies on a one-time training phase using a cycle-accurate reference of the chosen target processor. We train our models on over 15,000 program instances from the ACM-ICPC programming contest da...
Xinnian Zheng, Pradeep Ravikumar, Lizy K. John, An