Prior research into search system scalability has primarily addressed query processing efficiency [1, 2, 3] or indexing efficiency [3], or has presented some arbitrary system architecture [4]. Little work has introduced any formal theoretical framework for evaluating architectures with regard to specific operational requirements, or for comparing architectures beyond simple timings [5] or basic simulations [6, 7]. In this paper, we present a framework based upon queuing network theory for analyzing search systems in terms of operational requirements. We use response time, throughput, and utilization as the key operational characteristics for evaluating performance. Within this framework, we present a scalability strategy that combines index partitioning and index replication to satisfy a given set of requirements Categories and Subject Descriptors H.3.4 [Information Storage and Retrieval]: Systems and Software – distributed systems, performance evaluation. General Terms Performance,...