Abstract. Many real-time applications demand delay guarantees from the network. A network architecture designed to support these applications should be robust and scalable. The IntServ architecture provides per-flow QoS at the cost of robustness and scalability. The DiffServ architecture is robust and scalable but provides QoS only at the class level, not at the flow level. In this paper, we aim to design architectures that are as scalable and robust as DiffServ while still providing per-flow QoS like IntServ. We propose two architectures to achieve this goal, one non-work-conserving and one work-conserving. The guaranteeable delay regions of these architectures are the same as those of policies based on Generalized Processor Sharing (GPS) with rate-proportional resource allocation. We also propose a scheme that provides meaningful throughput and responsiveness to best-effort traffic even in the presence of heavy QoS load.