Diverse IP cores are integrated on a modern system-on-chip and share resources. Off-chip memory bandwidth is often the scarcest resource and requires careful allocation. Two of the most important cores, the CPU and the GPU, can both simultaneously demand high bandwidth. We demonstrate that conventional quality-of-service allocation techniques can severely constrict GPU performance by allowing the CPU to occasionally monopolize shared bandwidth. We propose to dynamically adapt the priority of CPU and GPU memory requests based on a novel mechanism that tracks progress of GPU workload. Our evaluation shows that our mechanism significantly improves GPU performance with only minimal impact on the CPU. Categories and Subject Descriptors