As integrated circuit technologies scale down, circuit and architectural trends make transmitting data across long on-chip wires increasingly important yet increasingly expensive in both latency and throughput. Inserting repeaters reduces latency by breaking up long wires with gain stages but offers only limited throughput improvement, while pipelining long wires with clocked latches improves throughput but requires generating fast local clocks. In contrast, asynchronous handshaking over long wires can improve both latency and bandwidth with lower control overhead. We introduce simple latency models that relate the optimal stage separation to technology parameters. However, the transactional nature of handshaking imposes a fundamental limit on throughput, one that long wires exacerbate. We present a twin request/acknowledge control scheme that overcomes this throughput cost.
Ron Ho, Jonathan Gainsley, Robert J. Drost