Abstract. The advent of mass-market multicore processors provides exciting new opportunities for parallelism on the desktop. The original C++CSP – a library providing concurrency in C++ – used only user-threads, which would have prevented it taking advantage of this parallelism. This paper details the development of C++CSP2, which has been built around a many-to-many threading model that mixes user-threads and kernel-threads, providing maximum flexibility in taking advantage of multicore and multi-processor machines. New and existing algorithms are described for dealing with the run-queue and implementing channels, barriers and mutexes. The latter two are benchmarked to motivate the choice of algorithm. Most of these algorithms are based on the use of atomic instructions, to gain maximal speed and efficiency. Other issues related to the new design and related to implementing concurrency in a language like C++ that has no direct support for it, are also described. The C++CSP2 libr...