We present an implementation of an algorithm for constructing provably fast circuits for a class of Boolean functions with input signals that have individual starting times. We show how to adapt this algorithm to logic optimization for timing correction at late stages of VLSI physical design and report experimental results on recent industrial chips. By restructuring long critical paths, our code achieves worst-slack improvements of up to several hundred picoseconds on top of traditional timing optimization techniques.