This paper describes an algorithm for simultaneous gate sizing and fanout optimization along the timing-critical paths in a circuit. First, a continuous-variable delay model that captures both sizing and buffering effects is presented. Next, the optimization problem is formulated as a non-convex mathematical program. To manage the problem size, only a small number of critical paths are considered simultaneously. The mathematical program is solved by a non-linear programming package. Finally, a design flow based on iterative selection and optimization of the k most critical paths in the circuit is proposed. Experimental results show that the proposed flow reduces the circuit delay by an average of 10% compared to conventional flows that separate gate sizing from fanout optimization.