This paper describes a method for optimizing the performance of data paths. The method is based on bit-level arithmetic transformations and is especially suited to large adder structures inside data paths. Multioperand adders are identified at the bit level, and their addition parts are merged, even across operator boundaries. Area and delay are optimized using canonical signed digit (CSD) coding and timing-driven transformations, including bit-slice adder trees and logarithmic addition. The method thus links data path optimization at the word level with logic synthesis techniques at the bit level. Experiments show that, starting from a very simple description of an N×N multiplier, an O(log N) delay is obtained with very low run times.
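As an illustration of the CSD coding mentioned above, the following minimal Python sketch recodes a non-negative integer constant into canonical signed digit form. It is not taken from the paper's implementation; the function names `to_csd` and `csd_value` are hypothetical and chosen only for this example. Each nonzero CSD digit corresponds to one shifted operand to be added or subtracted, so fewer nonzero digits mean fewer rows in a multioperand adder.

```python
def to_csd(n: int) -> list[int]:
    """Recode a non-negative integer into canonical signed digits.

    Returns the digits least significant first, each in {-1, 0, +1},
    with no two adjacent digits both nonzero.
    """
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 gives digit +1; n mod 4 == 3 gives digit -1.
            d = 2 - (n & 3)
            n -= d  # n is now divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_value(digits: list[int]) -> int:
    """Evaluate a little-endian signed-digit list back to an integer."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


if __name__ == "__main__":
    c = 0b1110111  # 119 has six nonzero bits in plain binary
    csd = to_csd(c)
    nonzero = sum(1 for d in csd if d != 0)
    print(csd, csd_value(csd), nonzero)
```

For the constant 119, plain binary has six nonzero bits, whereas the CSD form 128 - 8 - 1 has three, so a multiplication by this constant would contribute three shifted operands instead of six.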