Nested data-parallelism (NDP) is a declarative style for programming irregular parallel applications. NDP languages provide language features favoring the NDP style, efficient compilation of NDP programs, and various common NDP operations like parallel maps, filters, and sum-like reductions. In this paper, we describe the implementation of NDP in Parallel ML (PML), part of the Manticore project. Managing the parallel decomposition of work is one of the main challenges of implementing NDP. If the decomposition creates too many small chunks of work, performance will be eroded by too much parallel overhead. If, on the other hand, there are too few large chunks of work, there will be too much sequential processing and processors will sit idle. Recently the technique of Lazy Binary Splitting was proposed for dynamic parallel decomposition of work on flat arrays, with promising results. We adapt Lazy Binary Splitting to parallel processing of binary trees, which we use to represent parallel...
Lars Bergstrom, Mike Rainey, John H. Reppy, Adam S