Multicore architectures, which have multiple processing units on a single chip, have been adopted by most chip manufacturers. Most such chips contain on-chip caches that are shared by some or all of the cores. To effectively use the available processing resources on such platforms, scheduling methods must be aware of these caches. In this paper, we explore various heuristics that attempt to improve cache performance when scheduling real-time workloads. Such heuristics are applicable when multiple multithreaded applications with large working sets must be scheduled. In addition, we present a case study that shows how our best-performing heuristics can improve the end-user performance of video encoding applications.
John M. Calandrino, James H. Anderson