A multi-tenant cloud system allows multiple users to share a common physical computing infrastructure in a cost-effective way. Component sharing is highly desirable in such a shared computing infrastructure, where different tenants can leverage each other’s information and expertise to fulfill their own tasks. However, it is challenging to maintain the availability of sharable component resources in a large-scale cloud infrastructure, as cloud tenants are fully autonomous and highly dynamic. In this paper, we present a novel, highly available component sharing system for large-scale multi-tenant cloud systems. We describe a component availability prediction scheme to identify endangered components (i.e., components at risk of extinction) within the infrastructure. The system then performs predictive replication based on the availability prediction results to preserve those endangered components. Thus, our system can preserve the availability of all component resources at low cost. T...
Juan Du, Xiaohui Gu, Douglas S. Reeves