In a parallel, shared-memory language with a garbage-collected heap, it is desirable for each processor to perform minor garbage collections independently. Although the idea is obvious, it is difficult to make it pay off in practice, especially in languages where mutation is common. We present several techniques that substantially improve the state of the art. We describe these techniques in the context of a full-scale implementation of Haskell, and demonstrate that our local-heap collector substantially improves scaling, peak performance, and robustness.

Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors—Memory management (garbage collection)

General Terms Languages, Performance
Simon Marlow, Simon L. Peyton Jones