One has a large workload that is “divisible” (its constituent work’s granularity can be adjusted arbitrarily) and one has access to p remote computers that can assist in computing the workload. How can one best utilize the computers? Two features complicate this question. First, the remote computers may differ from one another in speed. Second, each remote computer is subject to interruptions of known likelihood that kill all work in progress on it. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed. We deal with three distinct problem instances. The simplest problem ignores communication costs, but considers a heterogeneous set of resources that may differ in speed. The other two problems account for communication costs, first with identical remote computers, and then with computers that may differ in speed. We provide exact expressions for the optimal work expectation for all three problems....
Anne Benoit, Yves Robert, Arnold L. Rosenberg, Fr&