We present a load-balancing technique that exploits the temporal coherence, among successive computation phases, in mesh-like computations to be mapped on a cluster of processors. Our method partitions the computation in balanced tasks and distributes them to independent processors through the Prediction Binary Tree (PBT). At each new phase, current PBT is updated by using previous phase computing time (for each task) as (next phase) cost estimate. The PBT is designed so that it balances the load across the tasks as well as reduce dependency among processors for higher performances. Reducing dependency is obtained by using rectangular tiles of the mesh, of almost-square shape (i.e. one dimension is at most twice the other). By reducing dependency, one can reduce inter-processors communication or exploit local dependencies among tasks (such as data locality). Our strategy has been assessed on a significant problem, Parallel Ray Tracing. Our implementation shows a good scalability, and...