In contrast to sort-first, sort-last parallel rendering has the distinct advantage that the task division for parallel geometry processing and rasterization is simple and can easily be incorporated into most visualization systems. However, efficient final compositing of the partial rendering results, by depth-compositing for polygonal data or alpha-blending for volume data, is the key to achieving scalability in sort-last parallel rendering. In this paper, we demonstrate both the efficiency and the flexibility of the direct send sort-last compositing algorithm, and compare it to existing approaches in a theoretical analysis as well as in an experimental setting.