For Video-On-Demand (VOD) systems, it is important to provide Quality of Service (QoS) to as many clients as possible under limited resources. In this paper, the performance scalability of cluster-based VOD servers is studied with several grouping configurations of cluster nodes. To find performance bottlenecks, monitoring functions are employed and the maximum number of QoS streams is measured under various request patterns, including VCR functions. To support a more user-friendly interface, an embedded set-top model is suggested for the QoS of TV clients. Based on our detailed experimental results, a new admission control method is proposed that is based on the available system resources and the actual amount of resources consumed by QoS streams. The proposed method not only provides more scalable QoS in cluster-based VOD servers but also improves resource utilization by guaranteeing the maximum number of QoS streams. © 2006 Elsevier B.V. All rights reserved.
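The admission control idea summarized above (admit a new stream only if the available resources can cover what a QoS stream actually consumes) can be illustrated with a minimal sketch. This is not the paper's algorithm: the resource names, the `headroom` safety margin, the nominal per-stream costs, and the class and method names are all hypothetical, and the measured usage is simulated rather than read from a real monitor.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    capacity: float     # total capacity (hypothetical units, e.g. Mbit/s)
    used: float = 0.0   # currently measured consumption (simulated here)


class AdmissionController:
    """Illustrative resource-based admission control (assumed design)."""

    def __init__(self, resources, nominal_cost, headroom=0.1):
        self.resources = resources        # resource name -> Resource
        self.nominal_cost = nominal_cost  # resource name -> cost guess with no history
        self.headroom = headroom          # fraction of capacity kept in reserve
        self.active_streams = 0

    def per_stream_cost(self, name):
        # Estimate cost from actual consumption once streams are running;
        # fall back to the nominal guess when nothing has been measured yet.
        if self.active_streams == 0:
            return self.nominal_cost[name]
        return self.resources[name].used / self.active_streams

    def can_admit(self):
        # A stream is admissible only if every resource can absorb it.
        for name, res in self.resources.items():
            limit = res.capacity * (1.0 - self.headroom)
            if res.used + self.per_stream_cost(name) > limit:
                return False
        return True

    def admit(self):
        if not self.can_admit():
            return False
        # Simulate the extra load; a real server would re-measure instead.
        for name, res in self.resources.items():
            res.used += self.per_stream_cost(name)
        self.active_streams += 1
        return True


def count_admitted(controller, requests):
    """Offer `requests` stream requests and return how many were admitted."""
    return sum(1 for _ in range(requests) if controller.admit())


if __name__ == "__main__":
    controller = AdmissionController(
        resources={"network": Resource(100.0), "cpu": Resource(1.0)},
        nominal_cost={"network": 4.0, "cpu": 0.02},
    )
    print(count_admitted(controller, 30), "of 30 requests admitted")
```

In this sketch the network is the bottleneck, so requests are rejected once another stream would push network usage past the reserved limit, even though CPU capacity remains. Basing the per-stream cost on measured consumption rather than a fixed worst-case figure is what lets such a scheme admit more streams while still protecting those already running.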