We introduce a parallelized version of tree-decomposition-based dynamic programming for solving difficult weighted CSP instances on many cores. A tree decomposition organizes cost functions into a tree of collections of functions called clusters. Processing the tree from the leaves up to the root, we solve each cluster concurrently for every assignment of its separator, using a state-of-the-art exact sequential algorithm. The granularity of the parallelism obtained in this way is directly related to the tree decomposition used. We use a dedicated strategy for building suitable decompositions. We present preliminary results of our prototype, running on a cluster with hundreds of cores, on several decomposable real-world problems. This implementation allowed us to solve the last open CELAR radio link frequency assignment instance to optimality.
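
The bottom-up scheme described above can be illustrated with a minimal Python sketch. It is not the prototype itself: the toy instance, the cluster layout, and the names `CLUSTERS`, `solve_cluster`, and `solve` are all hypothetical, a brute-force enumeration stands in for the exact sequential solver, and a thread pool stands in for the many-core cluster. The sketch only shows the task structure, namely one independent subproblem per cluster and per separator assignment, with child results combined through lookup tables.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

DOMAIN = (0, 1, 2)  # assumption: every variable shares this small domain


def differ(x, y):
    """Toy binary cost function: penalty 5 when both values are equal."""
    return 0 if x != y else 5


def prefer_low(x):
    """Toy unary cost function: the cost is the value itself."""
    return x


# Hypothetical tree decomposition. "sep" lists the variables shared with the
# parent cluster, "own" the remaining variables of the cluster.
CLUSTERS = {
    "root": {"sep": (), "own": ("a", "b"), "children": ("left", "right"),
             "costs": [(("a", "b"), differ), (("a",), prefer_low),
                       (("b",), prefer_low)]},
    "left": {"sep": ("b",), "own": ("c",), "children": (),
             "costs": [(("b", "c"), differ), (("c",), prefer_low)]},
    "right": {"sep": ("a",), "own": ("d", "e"), "children": (),
              "costs": [(("a", "d"), differ), (("d", "e"), differ),
                        (("e",), prefer_low)]},
}
BOTTOM_UP = ("left", "right", "root")  # leaves first, root last


def solve_cluster(name, sep_values, tables):
    """Minimum cost of one cluster subtree for a fixed separator assignment.

    Brute-force stand-in for an exact sequential solver.
    """
    cluster = CLUSTERS[name]
    fixed = dict(zip(cluster["sep"], sep_values))
    best = float("inf")
    for own_values in product(DOMAIN, repeat=len(cluster["own"])):
        assignment = {**fixed, **dict(zip(cluster["own"], own_values))}
        cost = sum(f(*(assignment[v] for v in scope))
                   for scope, f in cluster["costs"])
        # Add the optimum of each child subtree, already tabulated per
        # assignment of that child's separator.
        for child in cluster["children"]:
            key = tuple(assignment[v] for v in CLUSTERS[child]["sep"])
            cost += tables[child][key]
        best = min(best, cost)
    return best


def solve(max_workers=4):
    """Process clusters bottom-up, one task per separator assignment."""
    tables = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name in BOTTOM_UP:
            sep = CLUSTERS[name]["sep"]
            keys = list(product(DOMAIN, repeat=len(sep)))
            futures = [pool.submit(solve_cluster, name, key, tables)
                       for key in keys]
            tables[name] = {k: f.result() for k, f in zip(keys, futures)}
    return tables["root"][()]  # the root has an empty separator
```

In this sketch the number of tasks per cluster is the number of separator assignments, which is one way to see why the granularity of parallelism depends on the decomposition. Threads are used only to keep the example self-contained; a real deployment would need processes or distributed workers to obtain an actual speedup.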