Large-scale distributed systems are subject to churn, i.e., the continuous arrival, departure, and failure of processes. Analyzing protocols under churn requires churn models that are tractable (easy to apply), realistic (applicable to deployment settings), and general (applicable to many protocols and properties). In this paper, we propose two new churn models, called train and crowd, that together achieve these goals for arbitrary distributed protocols and for a broad class of stability properties called quiescent properties. We show (i) how the analysis of protocol quiescence in the train model can be extended to the crowd model, (ii) how to apply the train and crowd models to several distributed membership protocols, and (iii) how, even under real churn traces, the train and crowd models predict system-wide stability metrics for membership protocols reasonably well.
Steven Y. Ko, Imranul Hoque, Indranil Gupta