In recent years, the structured application-specific integrated circuit (ASIC) design style has reduced the impact of mask cost, because multiple structured ASIC designs share the same pre-fabricated device and wire masks. However, the high capacitive load of a pre-fabricated wire increases its interconnect delay and degrades circuit performance. We propose a dual-rail routing architecture that reduces wire delay by 10% to 15% compared to the original routing architecture. We also propose a dual-rail insertion algorithm that reduces the routing area overhead. Experimental results show that our dual-rail technique reduces wire delay by 9.8% with a 4.8% routing area overhead and improves overall circuit performance by 7.0%.