Recent years have seen rapid growth of online services that rely on large-scale server clusters to handle high volume of requests. Such clusters must adaptively control the CPU utilizations of many processors in order to maintain desired soft real-time performance and prevent system overload in face of unpredictable workloads. This paper presents DUC-LB, a novel distributed utilization control algorithm for cluster-based soft real-time applications. Compared to earlier works on utilization control, a distinguishing feature of DUC-LB is its capability to handle system dynamics caused by load balancing, which is a common and essential component of most clusters today. Simulation results and control-theoretic analysis demonstrate that DUCLB can provide robust utilization control and effective load balancing in large-scale clusters.