Datapath synthesis for standard-cell design goes through extraction of arithmetic operations from RTL code, high-level arithmetic optimizations and netlist generation. Numerous architectures and optimization strategies exist that result in circuit implementations with very different performance characteristics. This work summarizes the circuit architectures and techniques used in a commercial synthesis tool to optimize cell-based datapath netlists for timing, area and power.