This paper addresses timing-driven gate duplication for delay optimization. Gate duplication has been used extensively for cutset minimization, but its usefulness in minimizing circuit delay has not been addressed. This paper studies the complexity of timing-driven gate duplication and proposes an algorithm for solving the so-called global gate duplication problem. Delay improvements over highly optimized results from SIS are reported.
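
To make the idea of duplicating a gate to reduce delay concrete, the following is a minimal sketch and not the algorithm proposed in this paper. It assumes a simple load-dependent delay model in which each gate's delay is an intrinsic term plus a coefficient times its fanout count, and primary inputs arrive at time 0. The names `arrival_times`, `circuit_delay`, `intrinsic`, and `load_coeff`, and the small example netlist, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def arrival_times(fanins, intrinsic=1.0, load_coeff=0.5):
    """Arrival time at each gate output under an assumed toy delay model.

    fanins maps each gate to the list of nodes driving it; any node that
    is not a key of fanins is treated as a primary input arriving at 0.
    """
    # Count how many gate inputs each node drives (its fanout).
    fanout = defaultdict(int)
    for ins in fanins.values():
        for node in ins:
            fanout[node] += 1

    memo = {}

    def at(node):
        if node not in fanins:
            return 0.0  # primary input
        if node not in memo:
            # Assumed model: delay grows linearly with fanout count.
            delay = intrinsic + load_coeff * fanout[node]
            memo[node] = max(at(i) for i in fanins[node]) + delay
        return memo[node]

    return {gate: at(gate) for gate in fanins}

def circuit_delay(fanins, **kwargs):
    """Latest arrival time over all gates in the circuit."""
    return max(arrival_times(fanins, **kwargs).values())

# Original circuit: gate g drives four sinks, so it sees a heavy load.
original = {
    "d": ["a", "b"],
    "g": ["d"],
    "s1": ["g"], "s2": ["g"], "s3": ["g"], "s4": ["g"],
}

# After duplicating g: each copy drives two sinks, but d now drives both copies.
duplicated = {
    "d": ["a", "b"],
    "g": ["d"],
    "g_dup": ["d"],
    "s1": ["g"], "s2": ["g"], "s3": ["g_dup"], "s4": ["g_dup"],
}

print(circuit_delay(original))    # 5.5
print(circuit_delay(duplicated))  # 5.0
```

In this toy instance, duplication halves the load on `g` and shortens the critical path, but it also increases the load on the driver `d`, which becomes slower. This trade-off between the duplicated gate and its fanin gates is what makes the choice of which gates to duplicate a global rather than a purely local decision.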