Garbage collection can be a performance bottleneck in large distributed, multi-threaded applications. Applications may produce millions of objects during their lifetimes and may invoke hundreds or thousands of threads. When using a single shared heap, each time a garbage collection phase occurs all threads must be stopped, essentially halting all other processing. Attempts to fix this bottleneck include creating a single heap per thread, however this may not scale to large thread intensive applications. In this paper we explore the potential of clustering threads into related sub-heaps. We hypothesize that this will lead to a smaller shared heap, while maintaining good garbage collection parallelism. We leverage results from software module clustering to achieve this goal. Our results show that we can significantly reduce the number of sub-heaps created and reduce the number of objects in the shared heap in a representative application. This suggests that clustering may be a promising...
Myra B. Cohen, Shiu Beng Kooi, Witawas Srisa-an