We suggest that combining randomization with gossip communication can overcome the scalability barriers that limit the utility of many technologies for distributed system management, control, and communications. The proposed approach can be used “directly”, and it also makes possible a new kind of middleware. More broadly, we believe that these techniques enable distributed applications to achieve better resilience to stress, along with improved self-diagnosis and self-repair when failures or other severe disruptions occur.
Kenneth P. Birman