As distributed real-time applications gain in popularity, a key challenge is to allocate resources so that diverse realtime requirements (including non-real-time applications), distributed application components and varying workloads can all be accommodated without violating timeliness constraints. We examine the problem of resource allocation in distributed soft real-time systems, where both network and CPU resources are consumed. The timeliness constraints of applications are expressed through utility functions, which compute “benefit” as a function of end-to-end latency. We present LLA (Lagrangian Latency Assignment), a scalable and efficient distributed algorithm which maximizes aggregate utility by computing an optimal trade-off between end-to-end latency and allocated resources. The algorithm runs continuously and adapts to both workload and resource variations. LLA is guaranteed to converge if the workload and resource requirements stabilize. We evaluate the quality of re...